Preface to the Third Edition


Students of mathematics and computer science often have trouble the first 
time they’re asked to work seriously with mathematical proofs, because 
they don’t know the “rules of the game.” What is expected of you if you are 
asked to prove something? What distinguishes a correct proof from an 
incorrect one? This book is intended to help students learn the answers to 
these questions by spelling out the underlying principles involved in the 
construction of proofs. 

Many students get their first exposure to mathematical proofs in a high 
school course on geometry. Unfortunately, students in high school geometry 
are usually taught to think of a proof as a numbered list of statements and 
reasons, a view of proofs that is too restrictive to be very useful. There is a 
parallel with computer science here that can be instructive. Early 
programming languages encouraged a similar restrictive view of computer 
programs as numbered lists of instructions. Now computer scientists have 
moved away from such languages and teach programming by using 
languages that encourage an approach called “structured programming.” 
The discussion of proofs in this book is inspired by the belief that many of 
the considerations that have led computer scientists to embrace the 
structured approach to programming apply to proof writing as well. You 
might say that this book teaches “structured proving.” 

In structured programming, a computer program is constructed, not by 
listing instructions one after another, but by combining certain basic 
structures such as the if-else construct and do-while loop of the Java 
programming language. These structures are combined, not only by listing 
them one after another, but also by nesting one within another. For example, 
a program constructed by nesting an if-else construct within a do-while loop 
would look like this: 


do
    if [condition]
        [List of instructions goes here.]
    else
        [Alternative list of instructions goes here.]
while [condition]


The indenting in this program outline is not absolutely necessary, but it is a 
convenient method often used in computer science to display the underlying 
structure of a program. 

Mathematical proofs are also constructed by combining certain basic 
proof structures. For example, a proof of a statement of the form “if P then 
Q” often uses what might be called the “suppose-until” structure: we 
suppose that P is true until we are able to reach the conclusion that Q is 
true, at which point we retract this supposition and conclude that the 
statement “if P then Q” is true. Another example is the “for arbitrary x 
prove” structure: to prove a statement of the form “for all x, P(x),” we 
declare x to be an arbitrary object and then prove P(x). Once we reach the
conclusion that P(x) is true we retract the declaration of x as arbitrary and 
conclude that the statement “for all x, P(x)” is true. Furthermore, to prove 
more complex statements these structures are often combined, not only by 
listing one after another, but also by nesting one within another. For 
example, to prove a statement of the form “for all x, if P(x) then Q(x)” we 
would probably nest a “suppose-until” structure within a “for arbitrary x 
prove” structure, getting a proof of this form: 


Let x be arbitrary.
    Suppose P(x) is true.
        [Proof of Q(x) goes here.]
    Thus, if P(x) then Q(x).
Thus, for all x, if P(x) then Q(x).


As before, we have used indenting to make the underlying structure of the 
proof clear. 

Of course, mathematicians don’t ordinarily write their proofs in this 
indented form. Our aim in this book is to teach students to write proofs in 
ordinary paragraphs, just as mathematicians do, and not in the indented 
form. Nevertheless, our approach is based on the belief that if students are
to succeed at writing such proofs, they must understand the underlying
structure that proofs have. They must learn, for example, that sentences like 
“Let x be arbitrary” and “Suppose P” are not isolated steps in proofs, but 
are used to introduce the “for arbitrary x prove” and “suppose-until” proof 
structures. It is not uncommon for beginning students to use these sentences 
inappropriately in other ways. Such mistakes are analogous to the 
programming error of using a “do” with no matching “while.” 

Note that in our examples, the choice of proof structure is guided by the 
logical form of the statement being proven. For this reason, the book begins 
with elementary logic to familiarize students with the various forms that 
mathematical statements take. Chapter 1 discusses logical connectives, and 
quantifiers are introduced in Chapter 2. These chapters also present the 
basics of set theory, because it is an important subject that is used in the rest 
of the book (and throughout mathematics), and also because it serves to 
illustrate many of the points of logic discussed in these chapters. 

Chapter 3 covers structured proving techniques in a systematic way, 
running through the various forms that mathematical statements can take 
and discussing the proof structures appropriate for each form. The examples 
of proofs in this chapter are for the most part chosen, not for their 
mathematical content, but for the proof structures they illustrate. This is 
especially true early in the chapter, when only a few proof techniques have 
been discussed, and as a result many of the proofs in this part of the chapter 
are rather trivial. As the chapter progresses, the proofs get more 
sophisticated and more interesting, mathematically. 

Chapters 4 and 5, on relations and functions, serve two purposes. First, 
they provide subject matter on which students can practice the
proof-writing techniques from Chapter 3. And second, they introduce students to
some fundamental concepts used in all branches of mathematics. 

Chapter 6 is devoted to a method of proof that is very important in both 
mathematics and computer science: mathematical induction. The 
presentation builds on the techniques from Chapter 3, which students 
should have mastered by this point in the book. 

After completing Chapter 6, students should be ready to tackle more 
substantial mathematical topics. Two such topics are presented in Chapters 
7 and 8. Chapter 7, new in this third edition, gives an introduction to 
number theory, and Chapter 8 discusses infinite cardinalities. These
chapters give students more practice with mathematical proofs, and they
also provide a glimpse of what more advanced mathematics is like. 

Every section of every chapter ends with a list of exercises. Some 
exercises are marked with an asterisk; solutions or hints for these exercises 
are given in the appendix. Exercises marked with the symbol ’D can be done 
using Proof Designer software, which is available free on the internet. 

The biggest changes in this third edition are the addition of a new chapter 
on number theory and also more than 150 additional exercises. The section 
on reflexive, symmetric, and transitive closures of relations has been 
deleted from Chapter 4 (although these topics are now introduced in some 
exercises in Section 4.4); it has been replaced with a new section in Chapter 
5 on closures of sets under functions. There are also numerous small 
changes throughout the text. 

I would like to thank all those who sent me comments about earlier 
editions of this book. In particular, John Corcoran and Raymond Boute 
made several helpful suggestions. I am also grateful for advice from 
Jonathan Sands and several anonymous reviewers. 


Introduction 


What is mathematics? High school mathematics is concerned mostly with 
solving equations and computing answers to numerical questions. College 
mathematics deals with a wider variety of questions, involving not only 
numbers, but also sets, functions, and other mathematical objects. What ties 
them together is the use of deductive reasoning to find the answers to 
questions. When you solve an equation for x you are using the information 
given by the equation to deduce what the value of x must be. Similarly, 
when mathematicians solve other kinds of mathematical problems, they 
always justify their conclusions with deductive reasoning. 

Deductive reasoning in mathematics is usually presented in the form of a 
proof. One of the main purposes of this book is to help you develop your 
mathematical reasoning ability in general, and in particular your ability to 
read and write proofs. In later chapters we’ll study how proofs are 
constructed in detail, but first let’s take a look at a few examples of proofs. 

Don’t worry if you have trouble understanding these proofs. They’re just 
intended to give you a taste of what mathematical proofs are like. In some 
cases you may be able to follow many of the steps of the proof, but you 
may be puzzled about why the steps are combined in the way they are, or 
how anyone could have thought of the proof. If so, we ask you to be patient. 
Many of these questions will be answered later in this book, particularly in 
Chapter 3. 

All of our examples of proofs in this introduction will involve prime 
numbers. Recall that an integer larger than 1 is said to be prime if it cannot 
be written as a product of two smaller positive integers. If it can be written 
as a product of two smaller positive integers, then it is composite. For 
example, 6 is a composite number, since 6 = 2 · 3, but 7 is a prime number.

Before we can give an example of a proof involving prime numbers, we 
need to find something to prove — some fact about prime numbers whose 
correctness can be verified with a proof. Sometimes you can find interesting
patterns in mathematics just by trying out a calculation on a few numbers.
For example, consider the table in Figure I.1. For each integer n from 2 to
10, the table shows whether or not both n and 2^n - 1 are prime, and a
surprising pattern emerges. It appears that 2^n - 1 is prime in precisely those
cases in which n is prime!


n     Is n prime?        2^n - 1    Is 2^n - 1 prime?
2     yes                3          yes
3     yes                7          yes
4     no: 4 = 2 · 2      15         no: 15 = 3 · 5
5     yes                31         yes
6     no: 6 = 2 · 3      63         no: 63 = 7 · 9
7     yes                127        yes
8     no: 8 = 2 · 4      255        no: 255 = 15 · 17
9     no: 9 = 3 · 3      511        no: 511 = 7 · 73
10    no: 10 = 2 · 5     1023       no: 1023 = 31 · 33


Figure I.1. 


Will this pattern continue? It is tempting to guess that it will, but this is 
only a guess. Mathematicians call such guesses conjectures. Thus, we have 
the following two conjectures: 


Conjecture 1. Suppose n is an integer larger than 1 and n is prime. Then
2^n - 1 is prime.


Conjecture 2. Suppose n is an integer larger than 1 and n is not prime.
Then 2^n - 1 is not prime.


Unfortunately, if we continue the table in Figure I.1, we immediately find
that Conjecture 1 is incorrect. It is easy to check that 11 is prime, but
2^11 - 1 = 2047 = 23 · 89, so 2^11 - 1 is composite. Thus, 11 is a
counterexample to Conjecture 1. The existence of even one counterexample
establishes that the conjecture is incorrect, but it is interesting to note
that in this case there are many counterexamples. If we continue checking
numbers up to 30, we find two more counterexamples to Conjecture 1: both
23 and 29 are prime, but 2^23 - 1 = 8,388,607 = 47 · 178,481 and
2^29 - 1 = 536,870,911 = 2,089 · 256,999. However, no number up to 30 is a
counterexample to Conjecture 2.
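The search for counterexamples described above can be sketched in a few lines of Python. This is only an illustration, not part of the book's method: the helper names is_prime and counterexamples are our own, and primality is tested by simple trial division, which is adequate only because the numbers involved are small.

```python
def is_prime(k):
    """Trial division; adequate for the small numbers used here."""
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True


def counterexamples(limit):
    """Return two lists of integers n with 2 <= n <= limit:
    counterexamples to Conjecture 1 and counterexamples to Conjecture 2."""
    to_conjecture_1 = []  # n is prime, but 2^n - 1 is not prime
    to_conjecture_2 = []  # n is not prime, but 2^n - 1 is prime
    for n in range(2, limit + 1):
        n_is_prime = is_prime(n)
        mersenne_is_prime = is_prime(2**n - 1)
        if n_is_prime and not mersenne_is_prime:
            to_conjecture_1.append(n)
        if not n_is_prime and mersenne_is_prime:
            to_conjecture_2.append(n)
    return to_conjecture_1, to_conjecture_2


c1, c2 = counterexamples(30)
print(c1)  # [11, 23, 29]
print(c2)  # []
```

An empty second list only says that no counterexample to Conjecture 2 was found up to 30; it does not show that none exists.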


Do you think that Conjecture 2 is correct? Having found 
counterexamples to Conjecture 1, we know that this conjecture is incorrect, 
but our failure to find a counterexample to Conjecture 2 does not show that 
it is correct. Perhaps there are counterexamples, but the smallest one is 
larger than 30. Continuing to check examples might uncover a 
counterexample, or, if it doesn’t, it might increase our confidence in the 
conjecture. But we can never be sure that the conjecture is correct if we 
only check examples. No matter how many examples we check, there is 
always the possibility that the next one will be the first counterexample. 
The only way we can be sure that Conjecture 2 is correct is to prove it. 


In fact, Conjecture 2 is correct. Here is a proof of the conjecture: 


Proof of Conjecture 2. Since n is not prime, there are positive integers a and
b such that a < n, b < n, and n = ab. Let x = 2^b - 1 and
y = 1 + 2^b + 2^{2b} + ... + 2^{(a-1)b}. Then

    xy = (2^b - 1) · (1 + 2^b + 2^{2b} + ... + 2^{(a-1)b})
       = 2^b · (1 + 2^b + 2^{2b} + ... + 2^{(a-1)b}) - (1 + 2^b + 2^{2b} + ... + 2^{(a-1)b})
       = (2^b + 2^{2b} + 2^{3b} + ... + 2^{ab}) - (1 + 2^b + 2^{2b} + ... + 2^{(a-1)b})
       = 2^{ab} - 1
       = 2^n - 1.

Since b < n, we can conclude that x = 2^b - 1 < 2^n - 1. Also, since
ab = n > a, it follows that b > 1. Therefore, x = 2^b - 1 > 2^1 - 1 = 1, so
y < xy = 2^n - 1. Thus, we have shown that 2^n - 1 can be written as the
product of two positive integers x and y, both of which are smaller than
2^n - 1, so 2^n - 1 is not prime. □


Now that the conjecture has been proven, we can call it a theorem. Don’t
worry if you find the proof somewhat mysterious. We’ll return to it again at 
the end of Chapter 3 to analyze how it was constructed. For the moment, 
the most important point to understand is that if n is any integer larger than 
1 that can be written as a product of two smaller positive integers a and b, 
then the proof gives a method (admittedly, a somewhat mysterious one) of 
writing 2^n - 1 as a product of two smaller positive integers x and y. Thus, if
n is not prime, then 2^n - 1 must also not be prime. For example, suppose
n = 12, so 2^n - 1 = 4095. Since 12 = 3 · 4, we could take a = 3 and b = 4 in
the proof. Then according to the formulas for x and y given in the proof, we
would have x = 2^b - 1 = 2^4 - 1 = 15 and
y = 1 + 2^b + 2^{2b} + ... + 2^{(a-1)b} = 1 + 2^4 + 2^8 = 273. And, just as
the formulas in the proof predict, we have xy = 15 · 273 = 4095 = 2^n - 1.
Of course, there are other ways of factoring 12 into a product of two smaller
integers, and these might lead to other ways of factoring 4095. For example,
since 12 = 2 · 6, we could use the values a = 2 and b = 6. Try computing the
corresponding values of x and y and make sure their product is 4095.
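The formulas for x and y in the proof are easy to try out by machine. The following Python sketch is only an illustration; the function name factor_from_proof is our own invention. It computes x and y from a and b exactly as the proof prescribes and checks the case a = 3, b = 4 worked out above.

```python
def factor_from_proof(a, b):
    """Given n = a * b, return the factors x and y of 2^n - 1 used in the proof."""
    x = 2**b - 1
    # y = 1 + 2^b + 2^(2b) + ... + 2^((a-1)b)
    y = sum(2**(k * b) for k in range(a))
    return x, y


x, y = factor_from_proof(3, 4)
print(x, y, x * y)  # 15 273 4095

# The identity xy = 2^(ab) - 1 holds for every choice of positive a and b.
for a in range(1, 8):
    for b in range(1, 8):
        x_ab, y_ab = factor_from_proof(a, b)
        assert x_ab * y_ab == 2**(a * b) - 1
```

Note that the identity xy = 2^(ab) - 1 holds for all positive a and b; it is the additional requirements a < n and b < n that make both factors smaller than 2^n - 1.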

Although we already know that Conjecture 1 is incorrect, there are still
interesting questions we can ask about it. If we continue checking prime
numbers n to see if 2^n - 1 is prime, will we continue to find
counterexamples to the conjecture, that is, examples for which 2^n - 1 is not
prime? Will we continue to find examples for which 2^n - 1 is prime? If
there were only finitely many prime numbers, then we might be able to
investigate these questions by simply checking 2^n - 1 for every prime
number n. But in fact there are infinitely many prime numbers. Euclid
(circa 300 BCE) gave a proof of this fact in Book IX of his Elements. His
proof is one of the most famous in all of mathematics:¹


Theorem 3. There are infinitely many prime numbers. 


Proof. Suppose there are only finitely many prime numbers. Let p_1, p_2, ...,
p_n be a list of all prime numbers. Let m = p_1 p_2 ··· p_n + 1. Note that m
is not divisible by p_1, since dividing m by p_1 gives a quotient of
p_2 p_3 ··· p_n and a remainder of 1. Similarly, m is not divisible by any of
p_2, p_3, ..., p_n.

We now use the fact that every integer larger than 1 is either prime or can
be written as a product of two or more primes. (We’ll see a proof of this fact
in Chapter 6; see Theorem 6.4.2.) Clearly m is larger than 1, so m is either
prime or a product of primes. Suppose first that m is prime. Note that m is
larger than all of the numbers in the list p_1, p_2, ..., p_n, so we’ve found a
prime number not in this list. But this contradicts our assumption that this
was a list of all prime numbers.

Now suppose m is a product of primes. Let q be one of the primes in this
product. Then m is divisible by q. But we’ve already seen that m is not
divisible by any of the numbers in the list p_1, p_2, ..., p_n, so once again we
have a contradiction with the assumption that this list included all prime
numbers.

Since the assumption that there are finitely many prime numbers has led
to a contradiction, there must be infinitely many prime numbers. □


Once again, you should not be concerned if some aspects of this proof 
seem mysterious. After you’ve read Chapter 3 you’ll be better prepared to
understand the proof in detail. We’ll return to this proof then and analyze its 
structure. 
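The construction in the proof of Theorem 3 can be tried on any finite list of primes. The Python sketch below is only an illustration; the names smallest_prime_factor and prime_outside_list are our own. It forms m as the product of the listed primes plus 1 and then finds a prime that divides m. The sample list shows that m itself need not be prime, which is why the proof considers two cases.

```python
from math import prod


def smallest_prime_factor(m):
    """Smallest divisor of m larger than 1 (necessarily prime), for m > 1."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m  # no divisor found, so m itself is prime


def prime_outside_list(primes):
    """Return m = (product of the listed primes) + 1 and a prime q dividing m."""
    m = prod(primes) + 1
    q = smallest_prime_factor(m)
    return m, q


m, q = prime_outside_list([2, 3, 5, 7, 11, 13])
print(m, q)  # 30031 59
```

Here m = 30031 = 59 · 509 is composite, but its prime factor 59 does not appear in the list we started with.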


We have seen that if n is not prime then 2^n - 1 cannot be prime, but if n is
prime then 2^n - 1 can be either prime or composite. Because there are
infinitely many prime numbers, there are infinitely many numbers of the
form 2^n - 1 that, based on what we know so far, might be prime. But how
many of them are prime? 


Prime numbers of the form 2^n - 1 are called Mersenne primes, after
Father Marin Mersenne (1588–1648), a French monk and scholar who
studied these numbers. Although many Mersenne primes have been found,
it is still not known if there are infinitely many of them. Many of the largest
known prime numbers are Mersenne primes. As of this writing (February
2019), the largest known prime number is the Mersenne prime
2^{82,589,933} - 1, a number with 24,862,048 digits.

Mersenne primes are related to perfect numbers, the subject of another 
famous unsolved problem of mathematics. A positive integer n is said to be 
perfect if n is equal to the sum of all positive integers smaller than n that 
divide n. (For any two integers m and n, we say that m divides n if n is 
divisible by m; in other words, if there is an integer q such that n = qm.) For 
example, the only positive integers smaller than 6 that divide 6 are 1, 2, and 
3, and 1 + 2 + 3 = 6. Thus, 6 is a perfect number. The next smallest perfect
number is 28. (You should check for yourself that 28 is perfect by finding 
all the positive integers smaller than 28 that divide 28 and adding them up.) 

Euclid proved that if 2^n - 1 is prime, then 2^{n-1}(2^n - 1) is perfect. Thus,
every Mersenne prime gives rise to a perfect number. Furthermore, about
2000 years after Euclid’s proof, the Swiss mathematician Leonhard Euler
(1707–1783), the most prolific mathematician in history, proved that every
even perfect number arises in this way. (For example, note that
6 = 2^1(2^2 - 1) and 28 = 2^2(2^3 - 1).) Because it is not known if there are infinitely many
Mersenne primes, it is also not known if there are infinitely many even 
perfect numbers. It is also not known if there are any odd perfect numbers. 
For proofs of the theorems of Euclid and Euler, see exercises 18 and 19 in 
Section 7.4. 
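The definition of a perfect number and Euclid's formula can both be checked directly for small cases. The Python sketch below is only an illustration, and the function names are our own. It tests the formula 2^{n-1}(2^n - 1) for n = 2 and n = 3, where 2^n - 1 is prime, and for n = 4, where 2^n - 1 = 15 is not prime.

```python
def proper_divisor_sum(n):
    """Sum of all positive integers smaller than n that divide n."""
    return sum(d for d in range(1, n) if n % d == 0)


def is_perfect(n):
    """A positive integer is perfect if it equals the sum of its smaller divisors."""
    return n >= 1 and proper_divisor_sum(n) == n


def euclid_candidate(n):
    """The number 2^(n-1) * (2^n - 1) from Euclid's theorem."""
    return 2**(n - 1) * (2**n - 1)


results = [(n, euclid_candidate(n), is_perfect(euclid_candidate(n))) for n in (2, 3, 4)]
for row in results:
    print(row)
# (2, 6, True)
# (3, 28, True)
# (4, 120, False)
```

The last row shows that the hypothesis of Euclid's theorem matters: when 2^n - 1 is not prime, the formula need not produce a perfect number.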

Although there are infinitely many prime numbers, the primes thin out as 
we look at larger and larger numbers. For example, there are 25 primes 
between 1 and 100, 16 primes between 1001 and 1100, and only six primes 
between 1,000,001 and 1,000,100. As our last introductory example of a 
proof, we show that there are long stretches of consecutive positive integers 
containing no primes at all. In this proof, we’ll use the following 
terminology: for any positive integer n, the product of all integers from 1 to
n is called n factorial and is denoted n!. Thus, n! = 1 · 2 · 3 ··· n. As with
our previous two proofs, we’ll return to this proof at the end of Chapter 3 to 
analyze its structure. 


Theorem 4. For every positive integer n, there is a sequence of n 
consecutive positive integers containing no primes. 


Proof. Suppose n is a positive integer. Let x = (n + 1)! + 2. We will show
that none of the numbers x, x + 1, x + 2, ..., x + (n - 1) is prime. Since this
is a sequence of n consecutive positive integers, this will prove the theorem.
To see that x is not prime, note that

    x = 1 · 2 · 3 · 4 ··· (n + 1) + 2
      = 2 · (1 · 3 · 4 ··· (n + 1) + 1).

Thus, x can be written as a product of two smaller positive integers, so x is
not prime.
Similarly, we have

    x + 1 = 1 · 2 · 3 · 4 ··· (n + 1) + 3
          = 3 · (1 · 2 · 4 ··· (n + 1) + 1),

so x + 1 is also not prime. In general, consider any number x + i, where
0 ≤ i ≤ n - 1. Then we have

    x + i = 1 · 2 · 3 · 4 ··· (n + 1) + (i + 2)
          = (i + 2) · (1 · 2 · 3 ··· (i + 1) · (i + 3) ··· (n + 1) + 1),

so x + i is not prime. □
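The construction in this proof can be checked numerically for small n. The Python sketch below is only an illustration; the function name prime_free_run is our own. It builds the numbers x, x + 1, ..., x + (n - 1) with x = (n + 1)! + 2 and confirms that x + i is divisible by i + 2, as the proof claims.

```python
from math import factorial


def prime_free_run(n):
    """Return the n consecutive integers x, x + 1, ..., x + (n - 1),
    where x = (n + 1)! + 2."""
    x = factorial(n + 1) + 2
    return [x + i for i in range(n)]


run = prime_free_run(3)
print(run)  # [26, 27, 28]

# Each x + i has the divisor i + 2, which is larger than 1 and smaller than x + i.
for n in range(1, 9):
    for i, value in enumerate(prime_free_run(n)):
        assert value % (i + 2) == 0
        assert 1 < i + 2 < value
```

The numbers produced this way grow very quickly, and they are usually far from the first run of n consecutive composite numbers; the proof only needs some such run to exist.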


Theorem 4 shows that there are sometimes long stretches between one 
prime and the next prime. But primes also sometimes occur close together. 
Since 2 is the only even prime number, the only pair of consecutive integers 
that are both prime is 2 and 3. But there are lots of pairs of primes that 
differ by only two, for example, 5 and 7, 29 and 31, and 7949 and 7951. 
Such pairs of primes are called twin primes. It is not known whether there 
are infinitely many twin primes. 

Recently, significant progress has been made on the twin primes 
question. In 2013, Yitang Zhang (1955–) proved that there is a positive
integer d ≤ 70,000,000 such that there are infinitely many pairs of prime
numbers that differ by d. Work of many other mathematicians in 2013–14
narrowed down the possibilities for d to d ≤ 246. Of course, if the statement
holds with d = 2 then there are infinitely many twin primes. 


Exercises 


Note: Solutions or hints for exercises marked with an asterisk (*) are given 
in the appendix. 


*1. (a) Factor 2^15 - 1 = 32,767 into a product of two smaller positive
integers.

(b) Find an integer x such that 1 < x < 2^{32767} - 1 and 2^{32767} - 1 is
divisible by x.

2. Make some conjectures about the values of n for which 3^n - 1 is
prime or the values of n for which 3^n - 2^n is prime. (You might start
by making a table similar to Figure I.1.)

*3. The proof of Theorem 3 gives a method for finding a prime number 
different from any in a given list of prime numbers. 


(a) Use this method to find a prime different from 2, 3, 5, and 7. 
(b) Use this method to find a prime different from 2, 5, and 11. 


4. Find five consecutive integers that are not prime.

5. Use the table in Figure I.1 and the discussion of perfect numbers in
this introduction to find two more perfect numbers.

6. The sequence 3, 5, 7 is a list of three prime numbers such that each
pair of adjacent numbers in the list differ by two. Are there any more
such “triplet primes”?

7. A pair of distinct positive integers (m, n) is called amicable if the sum
of all positive integers smaller than n that divide n is m, and the sum
of all positive integers smaller than m that divide m is n. Show that
(220, 284) is amicable.


¹ Euclid phrased the theorem and proof somewhat differently. We have chosen to take a more
modern approach in our presentation.


1 Sentential Logic


1.1 Deductive Reasoning and Logical Connectives


As we saw in the introduction, proofs play a central role in mathematics, 
and deductive reasoning is the foundation on which proofs are based. 
Therefore, we begin our study of mathematical reasoning and proofs by 
examining how deductive reasoning works. 


Example 1.1.1. Here are three examples of deductive reasoning: 


1. It will either rain or snow tomorrow. 
It’s too warm for snow. 
Therefore, it will rain. 


2. If today is Sunday, then I don’t have to go to work today. 
Today is Sunday. 
Therefore, I don’t have to go to work today. 


3. I will go to work either tomorrow or today.
I’m going to stay home today. 
Therefore, I will go to work tomorrow. 


In each case, we have arrived at a conclusion from the assumption that 
some other statements, called premises, are true. For example, the premises 
in argument 3 are the statements “I will go to work either tomorrow or 
today” and “I’m going to stay home today.” The conclusion is “I will go to 
work tomorrow,” and it seems to be forced on us somehow by the premises. 

But is this conclusion really correct? After all, isn’t it possible that I’ll
stay home today, and then wake up sick tomorrow and end up staying home
again? If that happened, the conclusion would turn out to be false. But
notice that in that case the first premise, which said that I would go to work 
either tomorrow or today, would be false as well! Although we have no 
guarantee that the conclusion is true, it can only be false if at least one of 
the premises is also false. If both premises are true, we can be sure that the 
conclusion is also true. This is the sense in which the conclusion is forced 
on us by the premises, and this is the standard we will use to judge the 
correctness of deductive reasoning. We will say that an argument is valid if 
the premises cannot all be true without the conclusion being true as well. 
All three of the arguments in our example are valid arguments. 


Here’s an example of an invalid deductive argument: 


Either the butler is guilty or the maid is guilty. 
Either the maid is guilty or the cook is guilty. 
Therefore, either the butler is guilty or the cook is guilty. 


The argument is invalid because the conclusion could be false even if both 
premises are true. For example, if the maid were guilty, but the butler and 
the cook were both innocent, then both premises would be true and the 
conclusion would be false. 

We can learn something about what makes an argument valid by 
comparing the three arguments in Example 1.1.1. On the surface it might 
seem that arguments 2 and 3 have the most in common, because they’re 
both about the same subject: attendance at work. But in terms of the 
reasoning used, arguments 1 and 3 are the most similar. They both introduce 
two possibilities in the first premise, rule out the second one with the 
second premise, and then conclude that the first possibility must be the 
case. In other words, both arguments have the form: 


P or Q.
Not Q. 
Therefore, P. 


It is this form, and not the subject matter, that makes these arguments valid. 
You can see that argument 1 has this form by thinking of the letter P as 
standing for the statement “It will rain tomorrow,” and Q as standing for “It 
will snow tomorrow.” For argument 3, P would be “I will go to work 
tomorrow,” and Q would be “I will go to work today.” 
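Our definition of validity can be tested mechanically by trying every combination of true and false for the statements involved. The Python sketch below is only an illustration under two assumptions that go beyond what we have said so far: each statement is treated as simply true or false, and "or" is read inclusively (either or both). The function name is_valid is our own.

```python
from itertools import product


def is_valid(premises, conclusion, num_statements):
    """An argument form is valid if no assignment of truth values makes
    all of the premises true and the conclusion false."""
    for values in product([True, False], repeat=num_statements):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False  # found a case with true premises and a false conclusion
    return True


# The form shared by arguments 1 and 3: P or Q. Not Q. Therefore, P.
or_not_form = is_valid(
    premises=[lambda p, q: p or q, lambda p, q: not q],
    conclusion=lambda p, q: p,
    num_statements=2,
)

# The butler/maid/cook argument, with b, m, c standing for
# "the butler is guilty", "the maid is guilty", "the cook is guilty".
butler_form = is_valid(
    premises=[lambda b, m, c: b or m, lambda b, m, c: m or c],
    conclusion=lambda b, m, c: b or c,
    num_statements=3,
)

print(or_not_form, butler_form)  # True False
```

The second result is False because of exactly the situation described earlier: the maid is guilty while the butler and the cook are both innocent.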


Replacing certain statements in each argument with letters, as we have in 
stating the form of arguments 1 and 3, has two advantages. First, it keeps us 
from being distracted by aspects of the arguments that don’t affect their 
validity. You don’t need to know anything about weather forecasting or 
work habits to recognize that arguments 1 and 3 are valid. That’s because 
both arguments have the form shown earlier, and you can tell that this 
argument form is valid without even knowing what P and Q stand for. If 
you don’t believe this, consider the following argument: 


Either the framger widget is misfiring, or the wrompal mechanism is out of alignment.
I’ve checked the alignment of the wrompal mechanism, and it’s fine.
Therefore, the framger widget is misfiring.


If a mechanic gave this explanation after examining your car, you might 
still be mystified about why the car won’t start, but you’d have no trouble 
following his logic! 

Perhaps more important, our analysis of the forms of arguments 1 and 3 
makes clear what is important in determining their validity: the words or 
and not. In most deductive reasoning, and in particular in mathematical 
reasoning, the meanings of just a few words give us the key to 
understanding what makes a piece of reasoning valid or invalid. (Which are 
the important words in argument 2 in Example 1.1.1?) The first few 
chapters of this book are devoted to studying those words and how they are 
used in mathematical writing and reasoning. 

In this chapter, we’ll concentrate on words used to combine statements to
form more complex statements. We’ll continue to use letters to stand for 
statements, but only for unambiguous statements that are either true or 
false. Questions, exclamations, and vague statements will not be allowed. It 
will also be useful to use symbols, sometimes called connective symbols, to 
stand for some of the words used to combine statements. Here are our first 
three connective symbols and the words they stand for: 


Symbol    Meaning
∨         or
∧         and
¬         not


Thus, if P and Q stand for two statements, then we’ll write P ∨ Q to
stand for the statement “P or Q,” P ∧ Q for “P and Q,” and ¬P for “not P”
or “P is false.” The statement P ∨ Q is sometimes called the disjunction of
P and Q, P ∧ Q is called the conjunction of P and Q, and ¬P is called the
negation of P.


Example 1.1.2. Analyze the logical forms of the following statements: 


1. Either John went to the store, or we’re out of eggs. 
2. Joe is going to leave home and not come back. 


3. Either Bill is at work and Jane isn’t, or Jane is at work and Bill isn’t.


Solutions


1. If we let P stand for the statement “John went to the store” and Q 
stand for “We’re out of eggs,” then this statement could be 
represented symbolically as P ∨ Q.


2. If we let P stand for the statement “Joe is going to leave home” and Q
stand for “Joe is not going to come back,” then we could represent
this statement symbolically as P ∧ Q. But this analysis misses an
important feature of the statement, because it doesn’t indicate that Q
is a negative statement. We could get a better analysis by letting R
stand for the statement “Joe is going to come back” and then writing
the statement Q as ¬R. Plugging this into our first analysis of the
original statement, we get the improved analysis P ∧ ¬R.


3. Let B stand for the statement “Bill is at work” and J for the statement
“Jane is at work.” Then the first half of the statement, “Bill is at work
and Jane isn’t,” can be represented as B ∧ ¬J. Similarly, the second
half is J ∧ ¬B. To represent the entire statement, we must combine
these two with or, forming their disjunction, so the solution is
(B ∧ ¬J) ∨ (J ∧ ¬B).


Notice that in analyzing the third statement in the preceding example, we
added parentheses when we formed the disjunction of B ∧ ¬J and J ∧ ¬B to
indicate unambiguously which statements were being combined. This is
like the use of parentheses in algebra, in which, for example, the product of
a + b and a - b would be written (a + b) · (a - b), with the parentheses
serving to indicate unambiguously which quantities are to be multiplied. As
in algebra, it is convenient in logic to omit some parentheses to make our
expressions shorter and easier to read. However, we must agree on some
conventions about how to read such expressions so that they are still
unambiguous. One convention is that the symbol ¬ always applies only to
the statement that comes immediately after it. For example, ¬P ∧ Q means
(¬P) ∧ Q rather than ¬(P ∧ Q). We’ll see some other conventions about
parentheses later.


Example 1.1.3. What English sentences are represented by the following 
expressions? 


1. (¬S ∧ L) ∨ S, where S stands for “John is smart” and L stands for
“John is lucky.”


2. ¬S ∧ (L ∨ S), where S and L have the same meanings as before.
3. ¬(S ∧ L) ∨ S, with S and L still as before.


Solutions 


1. Either John isn’t smart and he is lucky, or he’s smart. 


2. John isn’t smart, and either he’s lucky or he’s smart. Notice how the 
placement of the word either in English changes according to where 
the parentheses are. 


3. Either John isn’t both smart and lucky, or John is smart. The word 
both in English also helps distinguish the different possible positions 
of parentheses. 
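To see that the placement of parentheses changes what a formula says, and not just how it looks, we can evaluate the three expressions from Example 1.1.3 for every combination of true and false values of S and L. The Python sketch below is only an illustration; it assumes that each statement is simply true or false and that "or" is inclusive, and the variable names smart and lucky are our own.

```python
from itertools import product

# The three formulas of Example 1.1.3, with Python's not/and/or
# standing in for the symbols ¬, ∧, ∨.
formulas = {
    "(¬S ∧ L) ∨ S": lambda smart, lucky: ((not smart) and lucky) or smart,
    "¬S ∧ (L ∨ S)": lambda smart, lucky: (not smart) and (lucky or smart),
    "¬(S ∧ L) ∨ S": lambda smart, lucky: (not (smart and lucky)) or smart,
}

rows = []
for smart, lucky in product([True, False], repeat=2):
    rows.append((smart, lucky) + tuple(f(smart, lucky) for f in formulas.values()))

print("S, L, then the value of each formula in order:")
for row in rows:
    print(row)
# (True, True, True, False, True)
# (True, False, True, False, True)
# (False, True, True, True, True)
# (False, False, False, False, True)
```

The three columns of results differ, so the three formulas really do express different statements; in particular, under these assumptions the third formula comes out true in every case.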


It is important to keep in mind that the symbols ∧, ∨, and ¬ don’t really
correspond to all uses of the words and, or, and not in English. For
example, the symbol ∧ could not be used to represent the use of the word
and in the sentence “John and Bill are friends,” because in this sentence the
word and is not being used to combine two statements. The symbols ∧ and
∨ can only be used between two statements, to form their conjunction or
disjunction, and the symbol ¬ can only be used before a statement, to
negate it. This means that certain strings of letters and symbols are simply
meaningless. For example, P ¬ ∧ Q, P ∧ ∨ Q, and P ¬ Q are all
“ungrammatical” expressions in the language of logic. “Grammatical”
expressions, such as those in Examples 1.1.2 and 1.1.3, are sometimes
called well-formed formulas or just formulas. Once again, it may be helpful
to think of an analogy with algebra, in which the symbols +, -, ·, and ÷ can
be used between two numbers, as operators, and the symbol - can also be
used before a number, to negate it. These are the only ways that these
symbols can be used in algebra, so expressions such as x - ÷ y are
meaningless.

Sometimes, words other than and, or, and not are used to express the 
meanings represented by ∧, ∨, and ¬. For example, consider the first
statement in Example 1.1.3. Although we gave the English translation
“Either John isn’t smart and he is lucky, or he’s smart,” an alternative way
of conveying the same information would be to say “Either John isn’t smart
but he is lucky, or he’s smart.” Often, the word but is used in English to
mean and, especially when there is some contrast or conflict between the
statements being combined. For a more striking example, imagine a
weather forecaster ending his forecast with the statement “Rain and snow 
are the only two possibilities for tomorrow’s weather.” This is just a 
roundabout way of saying that it will either rain or snow tomorrow. Thus, 
even though the forecaster has used the word and, the meaning expressed 
by his statement is a disjunction. The lesson of these examples is that to 
determine the logical form of a statement you must think about what the 
statement means, rather than just translating word by word into symbols. 

Sometimes logical words are hidden within mathematical notation. For
example, consider the statement 3 ≤ π. Although it appears to be a simple
statement that contains no words of logic, if you read it out loud you will
hear the word or. If we let P stand for the statement 3 < π and Q for the
statement 3 = π, then the statement 3 ≤ π would be written P ∨ Q. In this
example the statements represented by the letters P and Q are so short that
it hardly seems worthwhile to abbreviate them with single letters. In cases
like this we will sometimes not bother to replace the statements with letters,
so we might also write this statement as (3 < π) ∨ (3 = π).

For a slightly more complicated example, consider the statement 3 ≤ π <
4. This statement means 3 ≤ π and π < 4, so once again a word of logic has
been hidden in mathematical notation. Filling in the meaning that we just
worked out for 3 ≤ π, we can write the whole statement as [(3 < π) ∨ (3 =
π)] ∧ (π < 4). Knowing that the statement has this logical form might be
important in understanding a piece of mathematical reasoning involving this
statement.


Exercises


*1. Analyze the logical forms of the following statements:
(a) We’ll have either a reading assignment or homework problems, but
we won’t have both homework problems and a test.
(b) You won’t go skiing, or you will and there won’t be any snow.
(c) √7 ≰ 2.


2. Analyze the logical forms of the following statements:
(a) Either John and Bill are both telling the truth, or neither of them is.
(b) I’ll have either fish or chicken, but I won’t have both fish and mashed
potatoes.
(c) 3 is a common divisor of 6, 9, and 15.


3. Analyze the logical forms of the following statements:
(a) Alice and Bob are not both in the room.
(b) Alice and Bob are both not in the room.
(c) Either Alice or Bob is not in the room.
(d) Neither Alice nor Bob is in the room.


4. Analyze the logical forms of the following statements:
(a) Either both Ralph and Ed are tall, or both of them are handsome.
(b) Both Ralph and Ed are either tall or handsome.
(c) Both Ralph and Ed are neither tall nor handsome.
(d) Neither Ralph nor Ed is both tall and handsome.


5. Which of the following expressions are well-formed formulas?
(a) ¬(P, Q, ∧R).
(b) P ∧ ¬P.
(c) (P ∧ Q)(P ∨ R).


6. Let P stand for the statement “I will buy the pants” and S for the
statement “I will buy the shirt.” What English sentences are
represented by the following formulas?
(a) ¬(P ∧ ¬S).
(b) ¬P ∧ ¬S.
(c) ¬P ∨ ¬S.


7. Let S stand for the statement “Steve is happy” and G for “George is
happy.” What English sentences are represented by the following
formulas?
(a) (S ∨ G) ∧ (¬S ∨ ¬G).
(b) [S ∨ (G ∧ ¬S)] ∨ ¬G.
(c) S ∨ [G ∧ (¬S ∨ ¬G)].


8. Let T stand for the statement “Taxes will go up” and D for “The
deficit will go up.” What English sentences are represented by the
following formulas?
(a) T ∨ D.
(b) ¬(T ∧ D) ∧ ¬(¬T ∧ ¬D).
(c) (T ∧ ¬D) ∨ (D ∧ ¬T).


9. Identify the premises and conclusions of the following deductive
arguments and analyze their logical forms. Do you think the reasoning
is valid? (Although you will have only your intuition to guide you in
answering this last question, in the next section we will develop some
techniques for determining the validity of arguments.)
(a) Jane and Pete won’t both win the math prize. Pete will win either the
math prize or the chemistry prize. Jane will win the math prize.
Therefore, Pete will win the chemistry prize.
(b) The main course will be either beef or fish. The vegetable will be
either peas or corn. We will not have both fish as a main course and
corn as a vegetable. Therefore, we will not have both beef as a main
course and peas as a vegetable.
(c) Either John or Bill is telling the truth. Either Sam or Bill is lying.
Therefore, either John is telling the truth or Sam is lying.
(d) Either sales will go up and the boss will be happy, or expenses will go
up and the boss won’t be happy. Therefore, sales and expenses will
not both go up.


1.2 Truth Tables


We saw in Section 1.1 that an argument is valid if the premises cannot all be 
true without the conclusion being true as well. Thus, to understand how 
words such as and, or, and not affect the validity of arguments, we must see 
how they contribute to the truth or falsity of statements containing them. 

When we evaluate the truth or falsity of a statement, we assign to it one 
of the labels true or false, and this label is called its truth value. It is clear 
how the word and contributes to the truth value of a statement containing it. 
A statement of the form P ∧ Q can be true only if both P and Q are true; if
either P or Q is false, then P ∧ Q will be false too. Because we have
assumed that P and Q both stand for statements that are either true or false,
we can summarize all the possibilities with the table shown in Figure 1.1.
This is called a truth table for the formula P ∧ Q. Each row in the truth
table represents one of the four possible combinations of truth values for the
statements P and Q. Although these four possibilities can appear in the table
in any order, it is best to list them systematically so we can be sure that no
possibilities have been skipped. The truth table for ¬P is also quite easy to
construct because for ¬P to be true, P must be false. The table is shown in
Figure 1.2.


P  Q  P ∧ Q
F  F    F
F  T    F
T  F    F
T  T    T
Figure 1.1.

P  ¬P
F  T
T  F
Figure 1.2.


The truth table for P ∨ Q is a little trickier. The first three lines should
certainly be filled in as shown in Figure 1.3, but there may be some
question about the last line. Should P ∨ Q be true or false in the case in
which P and Q are both true? In other words, does P ∨ Q mean “P or Q, or
both” or does it mean “P or Q but not both”? The first way of interpreting
the word or is called the inclusive or (because it includes the possibility of
both statements being true), and the second is called the exclusive or. In
mathematics, or always means inclusive or, unless specified otherwise, so
we will interpret ∨ as inclusive or. We therefore complete the truth table for
P ∨ Q as shown in Figure 1.4. See exercise 3 for more about the exclusive
or.


P  Q  P ∨ Q
F  F    F
F  T    T
T  F    T
T  T    ?
Figure 1.3.

P  Q  P ∨ Q
F  F    F
F  T    T
T  F    T
T  T    T
Figure 1.4.
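The difference between the two readings of or can also be seen in a programming language. The short Python sketch below is only an illustration and not part of the text's development; the function names `inclusive_or` and `exclusive_or` are made up for this example. Python's built-in `or` behaves like ∨, while `!=` applied to two truth values gives the exclusive reading.

```python
from itertools import product

def inclusive_or(p, q):
    """Inclusive or: true when at least one of p, q is true (the meaning of ∨)."""
    return p or q

def exclusive_or(p, q):
    """Exclusive or: true when exactly one of p, q is true."""
    return p != q

def rows_where_they_differ():
    """Return the combinations of truth values on which the two readings disagree."""
    return [(p, q) for p, q in product([False, True], repeat=2)
            if inclusive_or(p, q) != exclusive_or(p, q)]

if __name__ == "__main__":
    print(rows_where_they_differ())
```

The two readings disagree only in the row where both statements are true, which is exactly the row marked with a question mark in Figure 1.3.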


Using the rules summarized in these truth tables, we can now work out 
truth tables for more complex formulas. All we have to do is work out the 
truth values of the component parts of a formula, starting with the 
individual letters and working up to more complex formulas a step at a 
time. 


Example 1.2.1. Make a truth table for the formula ¬(P ∨ ¬Q).


Solution

P  Q  ¬Q  P ∨ ¬Q  ¬(P ∨ ¬Q)
F  F  T     T         F
F  T  F     F         T
T  F  T     T         F
T  T  F     T         F


The first two columns of this table list the four possible combinations of
truth values of P and Q. The third column, listing truth values for the
formula ¬Q, is found by simply negating the truth values for Q in the
second column. The fourth column, for the formula P ∨ ¬Q, is found by
combining the truth values for P and ¬Q listed in the first and third
columns, according to the truth value rule for ∨ summarized in Figure 1.4.
According to this rule, P ∨ ¬Q will be false only if both P and ¬Q are false.
Looking in the first and third columns, we see that this happens only in row
two of the table, so the fourth column contains an F in the second row and
T’s in all other rows. Finally, the truth values for the formula ¬(P ∨ ¬Q) are
listed in the fifth column, which is found by negating the truth values in the
fourth column. (Note that these columns had to be worked out in order,
because each was used in computing the next.)
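The column-by-column procedure just described can be mirrored in a few lines of Python. This is only a sketch for readers who like to experiment, not part of the text's method: the function names are hypothetical, and Python's `not` and `or` stand in for ¬ and ∨.

```python
from itertools import product

def truth_table_example():
    """Build the five columns P, Q, ¬Q, P ∨ ¬Q, ¬(P ∨ ¬Q), one row per combination.

    Rows are produced in the order F F, F T, T F, T T.
    """
    rows = []
    for p, q in product([False, True], repeat=2):
        not_q = not q              # third column: negate Q
        p_or_not_q = p or not_q    # fourth column: combine columns one and three
        whole = not p_or_not_q     # fifth column: negate column four
        rows.append((p, q, not_q, p_or_not_q, whole))
    return rows

def show(value):
    """Render a truth value as T or F."""
    return "T" if value else "F"

if __name__ == "__main__":
    for row in truth_table_example():
        print("  ".join(show(v) for v in row))
```

As in the hand computation, each column is computed only from columns that were worked out before it.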


Example 1.2.2. Make a truth table for the formula ¬(P ∧ Q) ∨ ¬R.


Solution

P  Q  R  P ∧ Q  ¬(P ∧ Q)  ¬R  ¬(P ∧ Q) ∨ ¬R
F  F  F    F       T      T         T
F  F  T    F       T      F         T
F  T  F    F       T      T         T
F  T  T    F       T      F         T
T  F  F    F       T      T         T
T  F  T    F       T      F         T
T  T  F    T       F      T         T
T  T  T    T       F      F         F


Note that because this formula contains three letters, it takes eight lines to
list all possible combinations of truth values for these letters. (If a formula
contains n different letters, how many lines will its truth table have?)


Here’s a way of making truth tables more compactly. Instead of using 
separate columns to list the truth values for the component parts of a 
formula, just list those truth values below the corresponding connective 
symbol in the original formula. This is illustrated in Figure 1.5, for the 
formula from Example 1.2.1. In the first step, we have listed the truth
values for P and Q below these letters where they appear in the formula. In
step two, the truth values for ¬Q have been added under the ¬ symbol for
¬Q. In the third step, we have combined the truth values for P and ¬Q to get
the truth values for P ∨ ¬Q, which are listed under the ∨ symbol. Finally, in
the last step, these truth values are negated and listed under the initial ¬
symbol. The truth values added in the last step give the truth value for the
entire formula, so we will call the symbol under which they are listed (the
first ¬ symbol in this case) the main connective of the formula. Notice that
the truth values listed under the main connective in this case agree with the
values we found in Example 1.2.1.


Step 1
P  Q    ¬ (P  ∨  ¬  Q)
F  F       F        F
F  T       F        T
T  F       T        F
T  T       T        T

Step 2
P  Q    ¬ (P  ∨  ¬  Q)
F  F       F     T  F
F  T       F     F  T
T  F       T     T  F
T  T       T     F  T

Step 3
P  Q    ¬ (P  ∨  ¬  Q)
F  F       F  T  T  F
F  T       F  F  F  T
T  F       T  T  T  F
T  T       T  T  F  T

Step 4
P  Q    ¬ (P  ∨  ¬  Q)
F  F    F  F  T  T  F
F  T    T  F  F  F  T
T  F    F  T  T  T  F
T  T    F  T  T  F  T

Figure 1.5.


Now that we know how to make truth tables for complex formulas, we’re 
ready to return to the analysis of the validity of arguments. Consider again 
our first example of a deductive argument: 


It will either rain or snow tomorrow. 
It’s too warm for snow. 
Therefore, it will rain. 


As we have seen, if we let P stand for the statement “It will rain tomorrow” 
and Q for the statement “It will snow tomorrow,” then we can represent the 
argument symbolically as follows: 


P ∨ Q
¬Q
∴ P     (The symbol ∴ means therefore.)


We can now see how truth tables can be used to verify the validity of this 
argument. Figure 1.6 shows a truth table for both premises and the 
conclusion of the argument. Recall that we decided to call an argument 
valid if the premises cannot all be true without the conclusion being true as 
well. Looking at Figure 1.6 we see that the only row of the table in which 
both premises come out true is row three, and in this row the conclusion is 
also true. Thus, the truth table confirms that if the premises are all true, the 
conclusion must also be true, so the argument is valid. 


      Premises      Conclusion
P  Q  P ∨ Q  ¬Q         P
F  F    F    T          F
F  T    T    F          F
T  F    T    T          T
T  T    T    F          T
Figure 1.6.
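The same check can be carried out mechanically. The following Python sketch is an illustration under stated assumptions rather than a definitive tool: the function name `is_valid` and the representation of formulas as Python functions of their letters are choices made for this example. It runs through every row of the truth table and looks for a row in which all premises are true but the conclusion is false.

```python
from itertools import product

def is_valid(premises, conclusion, num_letters):
    """Return True if no row makes every premise true and the conclusion false.

    premises    -- list of functions, each taking num_letters truth values
    conclusion  -- a function taking the same truth values
    """
    for values in product([False, True], repeat=num_letters):
        if all(premise(*values) for premise in premises) and not conclusion(*values):
            return False  # this row shows the argument is invalid
    return True

# The rain/snow argument: premises P ∨ Q and ¬Q, conclusion P.
rain_snow_premises = [lambda p, q: p or q, lambda p, q: not q]
rain_snow_conclusion = lambda p, q: p

if __name__ == "__main__":
    print(is_valid(rain_snow_premises, rain_snow_conclusion, 2))
```

Dropping the second premise makes the check fail, because the row in which P is false and Q is true then has a true premise and a false conclusion.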


Example 1.2.3. Determine whether the following arguments are valid. 


1. Either John isn’t smart and he is lucky, or he’s smart. 
John is smart. 
Therefore, John isn’t lucky. 


2. The butler and the cook are not both innocent. 
Either the butler is lying or the cook is innocent. 
Therefore, the butler is either lying or guilty. 


Solutions 


1. As in Example 1.1.3, we let S stand for the statement “John is smart”
and L stand for “John is lucky.” Then the argument has the form:


(¬S ∧ L) ∨ S
S
∴ ¬L


Now we make a truth table for both premises and the conclusion.
(You should work out the intermediate steps in deriving column three
of this table to confirm that it is correct.)


          Premises          Conclusion
S  L  (¬S ∧ L) ∨ S  S          ¬L
F  F       F        F          T
F  T       T        F          F
T  F       T        T          T
T  T       T        T          F


Both premises are true in lines three and four of this table. The 
conclusion is also true in line three, but it is false in line four. Thus, it 
is possible for both premises to be true and the conclusion false, so 
the argument is invalid. In fact, the table shows us exactly why the 
argument is invalid. The problem occurs in the fourth line of the 
table, in which S and L are both true — in other words, John is both 
smart and lucky. Thus, if John is both smart and lucky, then both 
premises will be true but the conclusion will be false, so it would be a 
mistake to infer that the conclusion must be true from the assumption 
that the premises are true. 


2. Let B stand for the statement “The butler is innocent,” C for the
statement “The cook is innocent,” and L for the statement “The butler
is lying.” Then the argument has the form:


¬(B ∧ C)
L ∨ C
∴ L ∨ ¬B

Here is the truth table for the premises and conclusion:


              Premises         Conclusion
B  C  L  ¬(B ∧ C)  L ∨ C       L ∨ ¬B
F  F  F     T        F           T
F  F  T     T        T           T
F  T  F     T        T           T
F  T  T     T        T           T
T  F  F     T        F           F
T  F  T     T        T           T
T  T  F     F        T           F
T  T  T     F        T           T


The premises are both true only in lines two, three, four, and six, and 
in each of these cases the conclusion is true as well. Therefore, the 
argument is valid. 
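For readers who want to double-check both tables, here is a small Python sketch. It is offered only as an illustration: the helper name `counterexamples` is hypothetical, and formulas are written as Python functions of their letters. The helper lists every row in which all premises are true and the conclusion is false, so an empty list means the argument is valid.

```python
from itertools import product

def counterexamples(premises, conclusion, num_letters):
    """List every assignment making all premises true and the conclusion false."""
    return [values
            for values in product([False, True], repeat=num_letters)
            if all(premise(*values) for premise in premises)
            and not conclusion(*values)]

# Argument 1, letters in the order (S, L):
# premises (¬S ∧ L) ∨ S and S, conclusion ¬L.
argument_1 = counterexamples(
    [lambda s, l: ((not s) and l) or s, lambda s, l: s],
    lambda s, l: not l,
    2,
)

# Argument 2, letters in the order (B, C, L):
# premises ¬(B ∧ C) and L ∨ C, conclusion L ∨ ¬B.
argument_2 = counterexamples(
    [lambda b, c, l: not (b and c), lambda b, c, l: l or c],
    lambda b, c, l: l or (not b),
    3,
)

if __name__ == "__main__":
    print(argument_1)
    print(argument_2)
```

The single counterexample found for the first argument is the row in which S and L are both true, matching line four of the first table; no counterexample exists for the second argument.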


If you expected the first argument in Example 1.2.3 to turn out to be 
valid, it’s probably because the first premise confused you. It’s a rather 
complicated statement, which we represented symbolically with the formula 
(¬S ∧ L) ∨ S. According to our truth table, this formula is false if S and L
are both false, and true otherwise. But notice that this is exactly the same as
the truth table for the simpler formula L ∨ S! Because of this, we say that
the formulas (¬S ∧ L) ∨ S and L ∨ S are equivalent. Equivalent formulas
always have the same truth value no matter what statements the letters in
them stand for and no matter what the truth values of those statements are.
The equivalence of the premise (¬S ∧ L) ∨ S and the simpler formula L ∨
S may help you understand why the argument is invalid. Translating the
formula L V S back into English, we see that the first premise could have 
been stated more simply as “John is either lucky or smart (or both).” But 
from this premise and the second premise (that John is smart), it clearly 
doesn’t follow that he’s not lucky, because he might be both smart and 
lucky. 


Example 1.2.4. Which of these formulas are equivalent?
¬(P ∧ Q), ¬P ∧ ¬Q, ¬P ∨ ¬Q.


Solution

Here’s a truth table for all three statements. (You should check it yourself!)


P  Q  ¬(P ∧ Q)  ¬P ∧ ¬Q  ¬P ∨ ¬Q
F  F     T         T        T
F  T     T         F        T
T  F     T         F        T
T  T     F         F        F


The third and fifth columns in this table are identical, but they are
different from the fourth column. Therefore, the formulas ¬(P ∧ Q) and ¬P
∨ ¬Q are equivalent, but neither is equivalent to the formula ¬P ∧ ¬Q. This
should make sense if you think about what all the symbols mean. For
example, suppose P stands for the statement “The Yankees won last night”
and Q stands for “The Red Sox won last night.” Then ¬(P ∧ Q) would
represent the statement “The Yankees and the Red Sox did not both win last
night,” and ¬P ∨ ¬Q would represent “Either the Yankees or the Red Sox
lost last night”; these statements clearly convey the same information. On
the other hand, ¬P ∧ ¬Q would represent “The Yankees and the Red Sox
both lost last night,” which means something entirely different.

You can check for yourself by making a truth table that the formula ¬P ∧
¬Q from Example 1.2.4 is equivalent to the formula ¬(P ∨ Q). (To see that
this equivalence makes sense, notice that the statements “Both the Yankees
and the Red Sox lost last night” and “It is not the case that either the
Yankees or the Red Sox won last night” mean the same thing.) This
equivalence and the one discovered in Example 1.2.4 are called De
Morgan’s laws. They are named for the British mathematician Augustus De
Morgan (1806–1871).

In analyzing deductive arguments and the statements that occur in them, 
it is helpful to be familiar with a number of equivalences that come up 
often. Verify the equivalences in the following list yourself by making truth 
tables, and check that they make sense by translating the formulas into 
English, as we did in Example 1.2.4. 


De Morgan’s laws

¬(P ∧ Q) is equivalent to ¬P ∨ ¬Q.
¬(P ∨ Q) is equivalent to ¬P ∧ ¬Q.


Commutative laws

P ∧ Q is equivalent to Q ∧ P.
P ∨ Q is equivalent to Q ∨ P.


Associative laws

P ∧ (Q ∧ R) is equivalent to (P ∧ Q) ∧ R.
P ∨ (Q ∨ R) is equivalent to (P ∨ Q) ∨ R.


Idempotent laws

P ∧ P is equivalent to P.
P ∨ P is equivalent to P.


Distributive laws

P ∧ (Q ∨ R) is equivalent to (P ∧ Q) ∨ (P ∧ R).
P ∨ (Q ∧ R) is equivalent to (P ∨ Q) ∧ (P ∨ R).


Absorption laws

P ∨ (P ∧ Q) is equivalent to P.
P ∧ (P ∨ Q) is equivalent to P.


Double Negation law

¬¬P is equivalent to P.


Notice that because of the associative laws we can leave out parentheses
in formulas of the forms P ∧ Q ∧ R and P ∨ Q ∨ R without worrying that
the resulting formula will be ambiguous, because the two possible ways of
filling in the parentheses lead to equivalent formulas.
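If you would like to confirm some of these equivalences by machine as well as by hand, a brute-force comparison of truth tables is enough. The Python sketch below is only an illustration; the function name `equivalent` is hypothetical, and each formula is written as a Python function of its letters.

```python
from itertools import product

def equivalent(f, g, num_letters):
    """Return True if f and g have the same truth value in every row."""
    return all(f(*values) == g(*values)
               for values in product([False, True], repeat=num_letters))

# First De Morgan's law: ¬(P ∧ Q) and ¬P ∨ ¬Q.
de_morgan_1 = equivalent(lambda p, q: not (p and q),
                         lambda p, q: (not p) or (not q), 2)

# First distributive law: P ∧ (Q ∨ R) and (P ∧ Q) ∨ (P ∧ R).
distributive_1 = equivalent(lambda p, q, r: p and (q or r),
                            lambda p, q, r: (p and q) or (p and r), 3)

# A non-equivalence from Example 1.2.4: ¬(P ∧ Q) and ¬P ∧ ¬Q.
not_equivalent = equivalent(lambda p, q: not (p and q),
                            lambda p, q: (not p) and (not q), 2)

if __name__ == "__main__":
    print(de_morgan_1, distributive_1, not_equivalent)
```

A check like this confirms that two formulas agree in every row, but it does not replace the exercise of seeing why each law makes sense in English.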

Many of the equivalences in the list should remind you of similar rules
involving +, ·, and − in algebra. As in algebra, these rules can be applied to
more complex formulas, and they can be combined to work out more
complicated equivalences. Any of the letters in these equivalences can be
replaced by more complicated formulas, and the resulting equivalence will
still be true. For example, by replacing P in the double negation law with
the formula Q ∨ ¬R, you can see that ¬¬(Q ∨ ¬R) is equivalent to Q ∨ ¬R.
Also, if two formulas are equivalent, you can always substitute one for the
other in any expression and the results will be equivalent. For example,
since ¬¬P is equivalent to P, if ¬¬P occurs in any formula, you can always
replace it with P and the resulting formula will be equivalent to the original.


Example 1.2.5. Find simpler formulas equivalent to these formulas:


1. ¬(P ∨ ¬Q).
2. ¬(Q ∧ ¬P) ∨ P.


Solutions


1. ¬(P ∨ ¬Q)
is equivalent to ¬P ∧ ¬¬Q (De Morgan’s law),
which is equivalent to ¬P ∧ Q (double negation law).

You can check that this equivalence is right by making a truth table
for ¬P ∧ Q and seeing that it is the same as the truth table for ¬(P ∨
¬Q) found in Example 1.2.1.


2. ¬(Q ∧ ¬P) ∨ P
is equivalent to (¬Q ∨ ¬¬P) ∨ P (De Morgan’s law),
which is equivalent to (¬Q ∨ P) ∨ P (double negation law),
which is equivalent to ¬Q ∨ (P ∨ P) (associative law),
which is equivalent to ¬Q ∨ P (idempotent law).


Some equivalences are based on the fact that certain formulas are either
always true or always false. For example, you can verify by making a truth
table that the formula Q ∧ (P ∨ ¬P) is equivalent to just Q. But even before
you make the truth table, you can probably see why they are equivalent. In
every line of the truth table, P ∨ ¬P will come out true, and therefore Q ∧
(P ∨ ¬P) will come out true when Q is also true, and false when Q is false.
Formulas that are always true, such as P ∨ ¬P, are called tautologies.
Similarly, formulas that are always false are called contradictions. For
example, P ∧ ¬P is a contradiction.


Example 1.2.6. Are these formulas tautologies, contradictions, or neither?
P ∨ (Q ∨ ¬P), P ∧ ¬(Q ∨ ¬Q), P ∨ ¬(Q ∨ ¬Q).


Solution

First we make a truth table for all three formulas.


P  Q  P ∨ (Q ∨ ¬P)  P ∧ ¬(Q ∨ ¬Q)  P ∨ ¬(Q ∨ ¬Q)
F  F       T              F              F
F  T       T              F              F
T  F       T              F              T
T  T       T              F              T


From the truth table it is clear that the first formula is a tautology, the 
second a contradiction, and the third neither. In fact, since the last column is 
identical to the first, the third formula is equivalent to P. 
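The classification in this example can be reproduced with a short Python sketch. It is offered as an illustration only; the function name `classify` is hypothetical, and formulas are again written as Python functions of their letters.

```python
from itertools import product

def classify(formula, num_letters):
    """Classify a formula as 'tautology', 'contradiction', or 'neither'."""
    column = [formula(*values)
              for values in product([False, True], repeat=num_letters)]
    if all(column):
        return "tautology"       # true in every row
    if not any(column):
        return "contradiction"   # false in every row
    return "neither"

# The three formulas of Example 1.2.6.
first = classify(lambda p, q: p or (q or (not p)), 2)
second = classify(lambda p, q: p and (not (q or (not q))), 2)
third = classify(lambda p, q: p or (not (q or (not q))), 2)

if __name__ == "__main__":
    print(first, second, third)
```

The function only inspects the final column of the truth table, which is all that the definitions of tautology and contradiction refer to.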


We can now state a few more useful laws involving tautologies and
contradictions. You should be able to convince yourself that all of these
laws are correct by thinking about what the truth tables for the statements
involved would look like.


Tautology laws

P ∧ (a tautology) is equivalent to P.
P ∨ (a tautology) is a tautology.
¬(a tautology) is a contradiction.


Contradiction laws

P ∧ (a contradiction) is a contradiction.
P ∨ (a contradiction) is equivalent to P.
¬(a contradiction) is a tautology.


Example 1.2.7. Find simpler formulas equivalent to these formulas:


1. P ∨ (Q ∧ ¬P).
2. ¬(P ∨ (Q ∧ ¬R)) ∧ Q.


Solutions


1. P ∨ (Q ∧ ¬P)
is equivalent to (P ∨ Q) ∧ (P ∨ ¬P) (distributive law),
which is equivalent to P ∨ Q (tautology law).

The last step uses the fact that P ∨ ¬P is a tautology.


2. ¬(P ∨ (Q ∧ ¬R)) ∧ Q
is equivalent to (¬P ∧ ¬(Q ∧ ¬R)) ∧ Q (De Morgan’s law),
which is equivalent to (¬P ∧ (¬Q ∨ ¬¬R)) ∧ Q (De Morgan’s law),
which is equivalent to (¬P ∧ (¬Q ∨ R)) ∧ Q (double negation law),
which is equivalent to ¬P ∧ ((¬Q ∨ R) ∧ Q) (associative law),
which is equivalent to ¬P ∧ (Q ∧ (¬Q ∨ R)) (commutative law),
which is equivalent to ¬P ∧ ((Q ∧ ¬Q) ∨ (Q ∧ R)) (distributive law),
which is equivalent to ¬P ∧ (Q ∧ R) (contradiction law).

The last step uses the fact that Q ∧ ¬Q is a contradiction. Finally, by
the associative law for ∧ we can remove the parentheses without
making the formula ambiguous, so the original formula is equivalent
to the formula ¬P ∧ Q ∧ R.
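A long chain of equivalences like the one in part 2 is easy to get wrong by a single symbol, so it can be reassuring to test each link. The Python sketch below is an illustration, not part of the derivation: the names `steps`, `same_table`, and `all_steps_agree` are hypothetical. It lists the formulas in the chain in order and checks that each has the same truth table as the next.

```python
from itertools import product

# The successive formulas in part 2 of Example 1.2.7, in order.
steps = [
    lambda p, q, r: (not (p or (q and (not r)))) and q,
    lambda p, q, r: ((not p) and (not (q and (not r)))) and q,       # De Morgan
    lambda p, q, r: ((not p) and ((not q) or (not (not r)))) and q,  # De Morgan
    lambda p, q, r: ((not p) and ((not q) or r)) and q,              # double negation
    lambda p, q, r: (not p) and (((not q) or r) and q),              # associative
    lambda p, q, r: (not p) and (q and ((not q) or r)),              # commutative
    lambda p, q, r: (not p) and ((q and (not q)) or (q and r)),      # distributive
    lambda p, q, r: (not p) and (q and r),                           # contradiction
    lambda p, q, r: (not p) and q and r,                             # drop parentheses
]

def same_table(f, g):
    """Return True if f and g agree on all eight rows for the letters P, Q, R."""
    return all(f(*values) == g(*values)
               for values in product([False, True], repeat=3))

# Every formula in the chain should match the one that follows it.
all_steps_agree = all(same_table(steps[i], steps[i + 1])
                      for i in range(len(steps) - 1))

if __name__ == "__main__":
    print(all_steps_agree)
```

Agreement of the truth tables shows that no step changed the meaning of the formula; it does not show which law justifies each step, which is the point of the derivation itself.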


Exercises


*1. Make truth tables for the following formulas:
(a) ¬P ∨ Q.
(b) (S ∨ G) ∧ (¬S ∨ ¬G).


2. Make truth tables for the following formulas:
(a) ¬[P ∧ (Q ∨ ¬P)].
(b) (P ∨ Q) ∧ (¬P ∨ R).


3. In this exercise we will use the symbol + to mean exclusive or. In
other words, P + Q means “P or Q, but not both.”
(a) Make a truth table for P + Q.
(b) Find a formula using only the connectives ∧, ∨, and ¬ that is
equivalent to P + Q. Justify your answer with a truth table.


4. Find a formula using only the connectives ∧ and ¬ that is equivalent
to P ∨ Q. Justify your answer with a truth table.


5. Some mathematicians use the symbol ↓ to mean nor. In other words,
P ↓ Q means “neither P nor Q.”
(a) Make a truth table for P ↓ Q.
(b) Find a formula using only the connectives ∧, ∨, and ¬ that is
equivalent to P ↓ Q.
(c) Find formulas using only the connective ↓ that are equivalent to ¬P, P
∨ Q, and P ∧ Q.


6. Some mathematicians write P | Q to mean “P and Q are not both
true.” (This connective is called nand, and is used in the study of
circuits in computer science.)
(a) Make a truth table for P | Q.
(b) Find a formula using only the connectives ∧, ∨, and ¬ that is
equivalent to P | Q.
(c) Find formulas using only the connective | that are equivalent to ¬P, P
∨ Q, and P ∧ Q.


*7. Use truth tables to determine whether or not the arguments in exercise
9 of Section 1.1 are valid.


8. Use truth tables to determine which of the following formulas are
equivalent to each other:
(a) (P ∧ Q) ∨ (¬P ∧ ¬Q).
(b) ¬P ∨ Q.
(c) (P ∨ ¬Q) ∧ (Q ∨ ¬P).
(d) ¬(P ∨ Q).
(e) (Q ∧ P) ∨ ¬P.


9. Use truth tables to determine which of these statements are
tautologies, which are contradictions, and which are neither:
(a) (P ∨ Q) ∧ (¬P ∨ ¬Q).
(b) (P ∨ Q) ∧ (¬P ∧ ¬Q).
(c) (P ∨ Q) ∨ (¬P ∨ ¬Q).
(d) [P ∧ (Q ∨ ¬R)] ∨ (¬P ∨ R).


10. Use truth tables to check these laws:
(a) The second De Morgan’s law. (The first was checked in the text.)
(b) The distributive laws.


11. Use the laws stated in the text to find simpler formulas equivalent to
these formulas. (See Examples 1.2.5 and 1.2.7.)
(a) ¬(¬P ∧ ¬Q).
(b) (P ∧ Q) ∨ (P ∧ ¬Q).
(c) (P ∧ ¬Q) ∨ (¬P ∧ Q).


12. Use the laws stated in the text to find simpler formulas equivalent to
these formulas. (See Examples 1.2.5 and 1.2.7.)
(a) ¬(¬P ∨ Q) ∨ (P ∧ ¬R).
(b) ¬(P ∧ Q) ∨ (P ∧ ¬R).
(c) (P ∧ R) ∨ [¬R ∧ (P ∨ Q)].


13. Use the first De Morgan’s law and the double negation law to derive
the second De Morgan’s law.


*14. Note that the associative laws say only that parentheses are
unnecessary when combining three statements with ∧ or ∨. In fact,
these laws can be used to justify leaving parentheses out when more
than three statements are combined. Use associative laws to show that
[P ∧ (Q ∧ R)] ∧ S is equivalent to (P ∧ Q) ∧ (R ∧ S).


15. How many lines will there be in the truth table for a statement
containing n letters?


*16. Find a formula involving the connectives ∧, ∨, and ¬ that has the
following truth table:


Jom Tt 
Sms aneo 


T 


17. Find a formula involving the connectives ∧, ∨, and ¬ that has the
following truth table:


P  Q  ???
F  F   F
F  T   T
T  F   T
T  T   F


18. Suppose the conclusion of an argument is a tautology. What can you
conclude about the validity of the argument? What if the conclusion is
a contradiction? What if one of the premises is either a tautology or a
contradiction?


1.3 Variables and Sets 


In mathematical reasoning it is often necessary to make statements about 
objects that are represented by letters called variables. For example, if the 
variable x is used to stand for a number in some problem, we might be 
interested in the statement “x is a prime number.” Although we may 
sometimes use a single letter, say P, to stand for this statement, at other 
times we will revise this notation slightly and write P(x), to stress that this
is a statement about x. The latter notation makes it easy to talk about
assigning a value to x in the statement. For example, P(7) would represent
the statement “7 is a prime number,” and P(a + b) would mean “a + b is a
prime number.” If a statement contains more than one variable, our
abbreviation for the statement will include a list of all the variables 
involved. For example, we might represent the statement “p is divisible by 
q” by D(p, q). In this case, D(12, 4) would mean “12 is divisible by 4.” 

Although you have probably seen variables used most often to stand for 
numbers, they can stand for anything at all. For example, we could let M(x) 
stand for the statement “x is a man,” and W(x) for “x is a woman.” In this 
case, we are using the variable x to stand for a person. A statement might 
even contain several variables that stand for different kinds of objects. For 
example, in the statement “x has y children,” the variable x stands for a 
person, and y stands for a number. 

Statements involving variables can be combined using connectives, just 
like statements without variables. 


Example 1.3.1. Analyze the logical forms of the following statements: 


1. x is a prime number, and either y or z is divisible by x.


2. x is a man and y is a woman and x likes y, but y doesn’t like x.


Solutions


1. We could let P stand for the statement “x is a prime number,” D for “y
is divisible by x,” and E for “z is divisible by x.” The entire statement
would then be represented by the formula P ∧ (D ∨ E). But this
analysis, though not incorrect, fails to capture the relationship
between the statements D and E. A better analysis would be to let
P(x) stand for “x is a prime number” and D(y, x) for “y is divisible by
x.” Then D(z, x) would mean “z is divisible by x,” so the entire
statement would be P(x) ∧ (D(y, x) ∨ D(z, x)).


2. Let M(x) stand for “x is a man,” W(y) for “y is a woman,” and L(x, y)
for “x likes y.” Then L(y, x) would mean “y likes x.” (Notice that the
order of the variables after the L makes a difference!) The entire
statement would then be represented by the formula M(x) ∧ W(y) ∧
L(x, y) ∧ ¬L(y, x).
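Readers with some programming experience may find it helpful to think of P(x) and D(y, x) as functions that return a truth value once values are supplied for the variables. The Python sketch below illustrates this for the first statement in Example 1.3.1. It is only an analogy under stated assumptions: the variables are assumed to stand for positive integers, primality is tested by simple trial division, and the function name `statement` is made up for this example.

```python
def P(x):
    """P(x): "x is a prime number", tested by trial division."""
    if x < 2:
        return False
    return all(x % d != 0 for d in range(2, int(x ** 0.5) + 1))

def D(y, x):
    """D(y, x): "y is divisible by x" (x is assumed to be nonzero)."""
    return y % x == 0

def statement(x, y, z):
    """P(x) ∧ (D(y, x) ∨ D(z, x)): x is prime, and either y or z is divisible by x."""
    return P(x) and (D(y, x) or D(z, x))

if __name__ == "__main__":
    print(P(7), D(12, 4), statement(3, 6, 7))
```

Notice that D(12, 4) and D(4, 12) give different answers, just as the order of the variables mattered for L(x, y) and L(y, x).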


In the last section, we introduced the idea of assigning truth values to 
statements. This idea is unproblematic for statements that do not contain 
variables, since such statements are either true or false. But if a statement 
contains variables, we can no longer describe the statement as being simply 
true or false. Its truth value might depend on the values of the variables 
involved. For example, if P(x) stands for the statement “x is a prime 
number,” then P(x) would be true if x = 23, but false if x = 22. To deal with 
this complication, we will define truth sets for statements containing 
variables. Before giving this definition, though, it might be helpful to 
review some basic definitions from set theory. 

A set is a collection of objects. The objects in the collection are called the 
elements of the set. The simplest way to specify a particular set is to list its 
elements between braces. For example, {3, 7, 14} is the set whose elements 
are the three numbers 3, 7, and 14. We use the symbol ∈ to mean is an
element of. For example, if we let A stand for the set {3, 7, 14}, then we
could write 7 ∈ A to say that 7 is an element of A. To say that 11 is not an
element of A, we write 11 ∉ A.

A set is completely determined once its elements have been specified. 
Thus, two sets that have exactly the same elements are always equal. Also, 
when a set is defined by listing its elements, all that matters is which objects 
are in the list of elements, not the order in which they are listed. An element 
can even appear more than once in the list. Thus, {3, 7, 14}, {14, 3, 7}, and 
{3, 7, 14, 7} are three different names for the same set. 
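Python's built-in sets happen to follow the same conventions, so the claims in the last two paragraphs can be tried out directly. This is a small illustrative sketch; the variable names are chosen for this example only.

```python
# Three different ways of listing the same elements.
A = {3, 7, 14}
reordered = {14, 3, 7}
with_repeat = {3, 7, 14, 7}

# Order and repetition do not matter: all three names denote the same set.
same_set = (A == reordered == with_repeat)

# Membership corresponds to the symbols ∈ and ∉.
seven_is_element = 7 in A
eleven_is_not_element = 11 not in A

if __name__ == "__main__":
    print(same_set, seven_is_element, eleven_is_not_element)
```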

It may be impractical to define a set that contains a very large number of 
elements by listing all of its elements, and it would be impossible to give 
such a definition for a set that contains infinitely many elements. Often this 
problem can be overcome by listing a few elements with an ellipsis (...) 
after them, if it is clear how the list should be continued. For example, 
suppose we define a set B by saying that B = {2, 3, 5, 7, 11, 13, 17, ...}. 
Once you recognize that the numbers listed in the definition of B are the 
prime numbers, then you know that, for example, 23 ∈ B, even though it
wasn’t listed explicitly when we defined B. But this method requires
recognition of the pattern in the list of numbers in the definition of B, and
this requirement introduces an element of ambiguity and subjectivity into
our notation that is best avoided in mathematical writing. It is therefore
usually better to define such a set by spelling out the pattern that determines
the elements of the set.

In this case we could be explicit by defining B as follows:

B = {x | x is a prime number}.


This is read “B is equal to the set of all x such that x is a prime number,” 
and it means that the elements of B are the values of x that make the 
statement “x is a prime number” come out true. You should think of the 
statement “x is a prime number” as an elementhood test for the set. Any 
value of x that makes this statement come out true passes the test and is an 
element of the set. Anything else fails the test and is not an element. Of 
course, in this case the values of x that make the statement true are precisely 
the prime numbers, so this definition says that B is the set whose elements 
are the prime numbers, exactly as before. 


Example 1.3.2. Rewrite these set definitions using elementhood tests: 


1. E = {2, 4, 6, 8, ...}.


2. P = {George Washington, John Adams, Thomas Jefferson, James 
Madison, ...}. 


Solutions 


Although there might be other ways of continuing these lists of elements, 
probably the most natural ones are given by the following definitions: 


1. E = {n | n is a positive even integer}.


2. P = {z | z was a president of the United States}.


If a set has been defined using an elementhood test, then that test can be 
used to determine whether or not something is an element of the set. For 
example, consider the set {x | x² < 9}. If we want to know if 5 is an element
of this set, we simply apply the elementhood test in the definition of the set;
in other words, we check whether or not 5² < 9. Since 5² = 25 > 9, it fails
the test, so 5 ∉ {x | x² < 9}. On the other hand, (-2)² = 4 < 9, so -2 ∈ {x |
x² < 9}. The same reasoning would apply to any other number. For any
number y, to determine whether or not y ∈ {x | x² < 9}, we just check
whether or not y² < 9. In fact, we could think of the statement y ∈ {x | x² <
9} as just a roundabout way of saying y² < 9.
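
An elementhood test behaves much like a function that answers true or false. The following Python sketch is only an illustration of that idea; the function name in_set is made up for this example and is not notation used in this book.

```python
def in_set(y):
    """Elementhood test for {x | x^2 < 9}: y passes exactly when y^2 < 9."""
    return y ** 2 < 9

print(in_set(5))    # False, because 5^2 = 25 is not less than 9
print(in_set(-2))   # True, because (-2)^2 = 4 is less than 9
```

Asking whether y is an element of the set amounts to calling the test on y, which mirrors the observation that y ∈ {x | x² < 9} is a roundabout way of saying y² < 9.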


Notice that because the statement y ∈ {x | x² < 9} means the same thing
as y² < 9, it is a statement about y, but not x! To determine whether or not y
∈ {x | x² < 9} you need to know what y is (so you can compare its square to
9), but not what x is. We say that in the statement y ∈ {x | x² < 9}, y is a
free variable, whereas x is a bound variable (or a dummy variable). The free 
variables in a statement stand for objects that the statement says something 
about. Plugging in different values for a free variable affects the meaning of 
a statement and may change its truth value. The fact that you can plug in 
different values for a free variable means that it is free to stand for anything. 
Bound variables, on the other hand, are simply letters that are used as a 
convenience to help express an idea and should not be thought of as 
standing for any particular object. A bound variable can always be replaced 
by a new variable without changing the meaning of the statement, and often 
the statement can be rephrased so that the bound variables are eliminated 
altogether. For example, the statements y ∈ {x | x² < 9} and y ∈ {w | w² <
9} mean the same thing, because they both mean “y is an element of the set
of all numbers whose squares are less than 9.” In this last statement, all 
bound variables have been eliminated, and the only variable that appears in 
the statement is the free variable y. 


Note that x is a bound variable in the statement y ∈ {x | x² < 9} even
though it is a free variable in the statement x² < 9. This last statement is a
statement about x that would be true for some values of x and false for 
others. It is only when this statement is used inside the elementhood test 
notation that x becomes a bound variable. We could say that the notation
{x | ...} binds the variable x.
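
Readers who program may find it helpful to compare bound variables with the loop variable of a set comprehension. The sketch below is an analogy only; the finite range used as a universe is an assumption made so that the sets can actually be computed.

```python
# A small finite universe, chosen only so the example can be run.
universe = range(-10, 11)

set_with_x = {x for x in universe if x ** 2 < 9}
set_with_w = {w for w in universe if w ** 2 < 9}

# Renaming the bound variable (x to w) does not change the set.
print(set_with_x == set_with_w)   # True

# y is free: the truth of "y is in the set" depends on the value of y.
y = 2
print(y in set_with_x)            # True
y = 5
print(y in set_with_x)            # False
```

The comprehension binds its own variable, just as the notation {x | ...} binds x, while y must be given a value before the membership statement can be evaluated.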


Everything we have said about the set {x | x² < 9} would apply to any set
defined by an elementhood test. In general, the statement y ∈ {x | P(x)}
means the same thing as P(y), which is a statement about y but not x.
Similarly, y ∉ {x | P(x)} means the same thing as ¬P(y). Of course, the
expression {x | P(x)} is not a statement at all; it is a name for a set. As you 
learn more mathematical notation, it will become increasingly important to 
make sure you are careful to distinguish between expressions that are 
mathematical statements and expressions that are names for mathematical 
objects. 


Example 1.3.3. What do these statements mean? What are the free 
variables in each statement? 


1. a + b ∉ {x | x is an even number}.


2. y ∈ {x | x is divisible by w}.


3. 2 ∈ {w | 6 ∉ {x | x is divisible by w}}.


Solutions

1. This statement says that a + b is not an element of the set of all even 
numbers, or in other words, a + b is not an even number. Both a and b 
are free variables, but x is a bound variable. The statement will be true 
for some values of a and b and false for others. 

2. This statement says that y is divisible by w. Both y and w are free 
variables, but x is a bound variable. The statement is true for some 
values of y and w and false for others. 

3. This looks quite complicated, but if we go a step at a time, we can
decipher it. First, note that the statement 6 ∉ {x | x is divisible by w},
which appears inside the given statement, means the same thing as “6
is not divisible by w.” Substituting this into the given statement, we
find that the original statement is equivalent to the simpler statement
2 ∈ {w | 6 is not divisible by w}. But this just means the same thing
as “6 is not divisible by 2.” Thus, the statement has no free variables,
and both x and w are bound variables. Because there are no free 
variables, the truth value of the statement doesn’t depend on the 
values of any variables. In fact, since 6 is divisible by 2, the statement 
is false. 


Perhaps you have guessed by now how we can use set theory to help us 
understand truth values of statements containing free variables. As we have 
seen, a statement, say P(x), containing a free variable x, may be true for 
some values of x and false for others. To distinguish the values of x that 
make P(x) true from those that make it false, we could form the set of 
values of x for which P(x) is true. We will call this set the truth set of P(x). 


Definition 1.3.4. The truth set of a statement P(x) is the set of all values of
x that make the statement P(x) true. In other words, it is the set defined by
using the statement P(x) as an elementhood test: {x | P(x)}.


Note that we have defined truth sets only for statements containing one 
free variable. We will discuss truth sets for statements with more than one 
free variable in Chapter 4. 


Example 1.3.5. What are the truth sets of the following statements? 


1. Shakespeare wrote x. 


2. n is an even prime number.


Solutions


1. {x | Shakespeare wrote x} = {Hamlet, Macbeth, Twelfth Night, ...}. 


2. {n | n is an even prime number}. Because the only even prime number
is 2, this is the set {2}. Note that 2 and {2} are not the same thing!
The first is a number, and the second is a set whose only element is a
number. Thus, 2 ∈ {2}, but 2 ≠ {2}.
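
For a finite universe, a truth set can be computed by testing every value. The Python sketch below is a minimal illustration under that assumption; the names truth_set, is_prime, and is_even_prime are invented here, and the range 0 to 99 merely stands in for the natural numbers.

```python
def truth_set(statement, universe):
    """Return the set of values x in universe for which statement(x) is true."""
    return {x for x in universe if statement(x)}

def is_prime(n):
    """Primality test by trial division."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def is_even_prime(n):
    return n % 2 == 0 and is_prime(n)

even_primes = truth_set(is_even_prime, range(0, 100))
print(even_primes)          # {2}

# The number 2 is an element of {2}, but it is not equal to {2}.
print(2 in even_primes)     # True
print(2 == even_primes)     # False
```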


Suppose A is the truth set of a statement P(x). According to the definition 
of truth set, this means that A = {x | P(x)}. We’ve already seen that for any 
object y, the statement y ∈ {x | P(x)} means the same thing as P(y).
Substituting in A for {x | P(x)}, it follows that y ∈ A means the same thing
as P(y). Thus, we see that in general, if A is the truth set of P(x), then to say
that y ∈ A means the same thing as saying P(y).

When a statement contains free variables, it is often clear from context 
that these variables stand for objects of a particular kind. The set of all 
objects of this kind — in other words, the set of all possible values for the 
variables — is called the universe of discourse for the statement, and we say 
that the variables range over this universe. For example, in most contexts 
the universe for the statement x² < 9 would be the set of all real numbers;
the universe for the statement “x is a man” might be the set of all people. 

Certain sets come up often in mathematics as universes of discourse, and 
it is convenient to have fixed names for them. Here are a few of the most 
important ones: 


R = {x | x is a real number}.


Q = {x | x is a rational number}. 


(Recall that a real number is any number on the number line, and a rational 
number is a number that can be written as a fraction p/q, where p and q are 
integers.) 

Z = {x | x is an integer} = {..., -3, -2, -1, 0, 1, 2, 3, ...}.


N = {x | x is a natural number} = {0, 1, 2, 3, ...}. 


(Some books include 0 as a natural number and some don’t. In this book, 
we consider 0 to be a natural number.) 


The letters R, Q, and Z can be followed by a superscript + or - to indicate
that only positive or negative numbers are to be included in the set. For
example, R⁺ = {x | x is a positive real number}, and Z⁻ = {x | x is a negative
integer}.

Although the universe of discourse can usually be determined from 
context, it is sometimes useful to identify it explicitly. Consider a statement 
P(x) with a free variable x that ranges over a universe U. Although we have 
written the truth set of P(x) as {x | P(x)}, if there were any possibility of 
confusion about what the universe was, we could specify it explicitly by 
writing {x ∈ U | P(x)}; this is read “the set of all x in U such that P(x).”
This notation indicates that only elements of U are to be considered for 
elementhood in this truth set, and among elements of U, only those that pass 
the elementhood test P(x) will actually be in the truth set. For example, 
consider again the statement x² < 9. If the universe of discourse for this
statement were the set of all real numbers, then its truth set would be {x ∈
R | x² < 9}, or in other words, the set of all real numbers between -3 and 3.
But if the universe were the set of all integers, then the truth set would be {x
∈ Z | x² < 9} = {-2, -1, 0, 1, 2}. Thus, for example, 1.58 ∈ {x ∈ R | x² <
9} but 1.58 ∉ {x ∈ Z | x² < 9}. Clearly, the choice of universe can
sometimes make a difference!
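
The effect of the universe can be seen in a short Python sketch. This is an illustration under assumptions: a finite range stands in for Z, Python numbers stand in for real numbers, and the function names are invented for the example.

```python
# Truth set of x^2 < 9 when the universe is (a finite piece of) the integers.
integers = range(-10, 11)
truth_set_z = {x for x in integers if x ** 2 < 9}
print(sorted(truth_set_z))          # [-2, -1, 0, 1, 2]

def in_real_truth_set(y):
    """Test for {x in R | x^2 < 9}; any Python number is treated as real."""
    return y ** 2 < 9

def in_integer_truth_set(y):
    """Test for {x in Z | x^2 < 9}; y must first be an integer."""
    return float(y).is_integer() and y ** 2 < 9

print(in_real_truth_set(1.58))      # True
print(in_integer_truth_set(1.58))   # False
```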

Sometimes this explicit notation is used not to specify the universe of 
discourse but to restrict attention to just a part of the universe. For example, 
in the case of the statement x² < 9, we might want to consider the universe
of discourse to be the set of all real numbers, but in the course of some 
reasoning involving this statement we might want to temporarily restrict our 
attention to only positive real numbers. We might then be interested in the 
set {x ∈ R⁺ | x² < 9}. As before, this notation indicates that only positive
real numbers will be considered for elementhood in this set, and among
positive real numbers, only those whose square is less than 9 will be in the
set. Thus, for a number to be an element of this set, it must pass two tests: it
must be a positive real number, and its square must be less than 9. In other
words, the statement y ∈ {x ∈ R⁺ | x² < 9} means the same thing as y ∈ R⁺
∧ y² < 9. In general, y ∈ {x ∈ A | P(x)} means the same thing as y ∈ A ∧
P(y).
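
The two tests can be written as a single conjunction in code. The Python sketch below is only an illustration; the helper names are hypothetical and simply restate the rule that y ∈ {x ∈ A | P(x)} means y ∈ A ∧ P(y).

```python
def is_positive_real(y):
    """Stands in for the test y in R+ (Python numbers treated as reals)."""
    return y > 0

def square_less_than_9(y):
    """The elementhood test P(y): y^2 < 9."""
    return y ** 2 < 9

def in_restricted_set(y):
    """y is in {x in R+ | x^2 < 9} exactly when both tests pass."""
    return is_positive_real(y) and square_less_than_9(y)

print(in_restricted_set(2))    # True: positive, and 4 < 9
print(in_restricted_set(-2))   # False: fails the first test
print(in_restricted_set(4))    # False: fails the second test
```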

When a new mathematical concept has been defined, mathematicians are 
usually interested in studying any possible extremes of this concept. For 
example, when we discussed truth tables, the extremes we studied were 
statements whose truth tables contained only T’s (tautologies) or only F’s 
(contradictions). For the concept of the truth set of a statement containing a 
free variable, the corresponding extremes would be the truth sets of 
statements that are always true or always false. Suppose P(x) is a statement 
containing a free variable x that ranges over a universe U. It should be clear 
that if P(x) comes out true for every value of x in U, then the truth set of 
P(x) will be the whole universe U. For example, since the statement x² ≥ 0
is true for every real number x, the truth set of this statement is {x ∈ R | x²
≥ 0} = R. Of course, this is not unrelated to the concept of a tautology. For
example, since P ∨ ¬P is a tautology, the statement P(x) ∨ ¬P(x) will be
true for every x ∈ U, no matter what statement P(x) stands for or what the
universe U is, and therefore the truth set of the statement P(x) ∨ ¬P(x) will
be U.

For a statement P(x) that is false for every possible value of x, nothing in 
the universe can pass the elementhood test for the truth set of P(x), and so 
this truth set must have no elements. The idea of a set with no elements may 
sound strange, but it arises naturally when we consider truth sets for 
statements that are always false. Because a set is completely determined 
once its elements have been specified, there is only one set that has no 
elements. It is called the empty set, or the null set, and is often denoted ∅.
For example, {x ∈ Z | x ≠ x} = ∅. Since the empty set has no elements, the
statement x ∈ ∅ is an example of a statement that is always false, no matter
what x is. 


Another common notation for the empty set is based on the fact that any
set can be named by listing its elements between braces. Since the empty
set has no elements, we write nothing between the braces, like this: ∅ = {}.
Note that {∅} is not correct notation for the empty set. Just as we saw
earlier that 2 and {2} are not the same thing, ∅ is not the same as {∅}. The
first is a set with no elements, whereas the second is a set with one element,
that one element being ∅, the empty set.
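
The difference between ∅ and {∅} can also be checked by counting elements. The Python sketch below is an illustration only. Note two quirks of the language that are not part of the mathematics: in Python the literal {} denotes an empty dictionary, so the empty set is written set(), and a set that is to be an element of another set must be a frozenset.

```python
empty = set()                          # the empty set
set_containing_empty = {frozenset()}   # the set whose one element is the empty set

print(len(empty))                        # 0
print(len(set_containing_empty))         # 1
print(empty == set_containing_empty)     # False

# The truth set of a statement that is always false is empty.
print({x for x in range(-10, 11) if x != x} == empty)   # True
```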


Exercises 


*1. Analyze the logical forms of the following statements:

(a) 3 is a common divisor of 6, 9, and 15. (Note: You did this in exercise
2 of Section 1.1, but you should be able to give a better answer now.)
(b) x is divisible by both 2 and 3 but not 4.
(c) x and y are natural numbers, and exactly one of them is prime.

2. Analyze the logical forms of the following statements:

(a) x and y are men, and either x is taller than y or y is taller than x.
(b) Either x or y has brown eyes, and either x or y has red hair.
(c) Either x or y has both brown eyes and red hair.

3. Write definitions using elementhood tests for the following sets:

(a) {Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus, Neptune}.
(b) {Brown, Columbia, Cornell, Dartmouth, Harvard, Princeton,
University of Pennsylvania, Yale}.
(c) {Alabama, Alaska, Arizona, ..., Wisconsin, Wyoming}.
(d) {Alberta, British Columbia, Manitoba, New Brunswick,
Newfoundland and Labrador, Northwest Territories, Nova Scotia,
Nunavut, Ontario, Prince Edward Island, Quebec, Saskatchewan,
Yukon}.

4. Write definitions using elementhood tests for the following sets:

(a) {1, 4, 9, 16, 25, 36, 49, ...}.
(b) {1, 2, 4, 8, 16, 32, 64, ...}.
(c) {10, 11, 12, 13, 14, 15, 16, 17, 18, 19}.

5. Simplify the following statements. Which variables are free and
which are bound? If the statement has no free variables, say whether
it is true or false.

(a) -3 ∈ {x ∈ R | 13 - 2x > 1}.
(b) 4 ∈ {x ∈ R | 13 - 2x > 1}.
(c) 5 ∉ {x ∈ R | 13 - 2x > c}.

6. Simplify the following statements. Which variables are free and
which are bound? If the statement has no free variables, say whether it
is true or false.

(a) w ∈ {x ∈ R | 13 - 2x > c}.
(b) 4 ∈ {x ∈ R | 13 - 2x ∈ {y | y is a prime number}}. (It might make
this statement easier to read if we let P = {y | y is a prime number};
using this notation, we could rewrite the statement as 4 ∈ {x ∈ R |
13 - 2x ∈ P}.)
(c) 4 ∈ {x ∈ {y | y is a prime number} | 13 - 2x > 1}. (Using the same
notation as in part (b), we could write this as 4 ∈ {x ∈ P | 13 - 2x >
1}.)

7. List the elements of the following sets:

(a) {x ∈ R | 2x² + x - 1 = 0}.
(b) {x ∈ R⁺ | 2x² + x - 1 = 0}.
(c) {x ∈ Z | 2x² + x - 1 = 0}.
(d) {x ∈ N | 2x² + x - 1 = 0}.

8. What are the truth sets of the following statements? List a few
elements of the truth set if you can.

(a) Elizabeth Taylor was once married to x.
(b) x is a logical connective studied in Section 1.1.
(c) x is the author of this book.

9. What are the truth sets of the following statements? List a few
elements of the truth set if you can.

(a) x is a real number and x² - 4x + 3 = 0.
(b) x is a real number and x² - 2x + 3 = 0.
(c) x is a real number and 5 ∈ {y ∈ R | x² + y² < 50}.

1.4 Operations on Sets 


Suppose A is the truth set of a statement P(x) and B is the truth set of Q(x). 
What are the truth sets of the statements P(x) ∧ Q(x), P(x) ∨ Q(x), and
¬P(x)? To answer these questions, we introduce some basic operations on
sets. 


Definition 1.4.1. The intersection of two sets A and B is the set A ∩ B
defined as follows:


A ∩ B = {x | x ∈ A and x ∈ B}.
The union of A and B is the set A ∪ B defined as follows:
A ∪ B = {x | x ∈ A or x ∈ B}.
The difference of A and B is the set A \ B defined as follows:
A \ B = {x | x ∈ A and x ∉ B}.


Remember that the statements that appear in these definitions are
elementhood tests. Thus, for example, the definition of A ∩ B says that for
an object to be an element of A ∩ B, it must be an element of both A and B.
In other words, A ∩ B is the set consisting of the elements that A and B have
in common. Because the word or is always interpreted as inclusive or in
mathematics, anything that is an element of either A or B, or both, will be
an element of A ∪ B. Thus, we can think of A ∪ B as the set resulting from
throwing all the elements of A and B together into one set. A \ B is the set
you would get if you started with the set A and removed from it any
elements that were also in B.


Example 1.4.2. Suppose A = {1, 2, 3, 4, 5} and B = {2, 4, 6, 8, 10}. List 
the elements of the following sets: 

1. A ∩ B.

2. A ∪ B.

3. A \ B.

4. (A ∪ B) \ (A ∩ B).

5. (A \ B) ∪ (B \ A).


Solutions

1. A ∩ B = {2, 4}.

2. A ∪ B = {1, 2, 3, 4, 5, 6, 8, 10}.

3. A \ B = {1, 3, 5}.

4. We have just computed A ∪ B and A ∩ B in solutions 1 and 2, so all
we need to do is start with the set A ∪ B from solution 2 and remove
from it any elements that are also in A ∩ B. The answer is (A ∪ B) \
(A ∩ B) = {1, 3, 5, 6, 8, 10}.

5. We already have the elements of A \ B listed in solution 3, and B \ A =
{6, 8, 10}. Thus, their union is (A \ B) ∪ (B \ A) = {1, 3, 5, 6, 8, 10}.
Is it just a coincidence that this is the same as the answer to part 4?
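
The computations in this example can be reproduced with Python's built-in set type, whose operators &, |, and - correspond to intersection, union, and difference. This is offered only as a way of checking the answers by machine, not as part of the mathematical development.

```python
A = {1, 2, 3, 4, 5}
B = {2, 4, 6, 8, 10}

print(sorted(A & B))               # [2, 4]
print(sorted(A | B))               # [1, 2, 3, 4, 5, 6, 8, 10]
print(sorted(A - B))               # [1, 3, 5]
print(sorted((A | B) - (A & B)))   # [1, 3, 5, 6, 8, 10]
print(sorted((A - B) | (B - A)))   # [1, 3, 5, 6, 8, 10]
```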


Example 1.4.3. Suppose A = {x | x is a man} and B = {x | x has brown 
hair}. What are A ∩ B, A ∪ B, and A \ B?


Solution


By definition, A ∩ B = {x | x ∈ A and x ∈ B}. As we saw in the last section,
the definitions of A and B tell us that x ∈ A means the same thing as “x is a
man,” and x ∈ B means the same thing as “x has brown hair.” Plugging this
into the definition of A ∩ B, we find that


A ∩ B = {x | x is a man and x has brown hair}.
Similar reasoning shows that
A ∪ B = {x | either x is a man or x has brown hair}
and
A \ B = {x | x is a man and x does not have brown hair}.


Sometimes it is helpful when working with operations on sets to draw
pictures of the results of these operations. One way to do this is with
diagrams like that in Figure 1.7. This is called a Venn diagram. The interior
of the rectangle enclosing the diagram represents the universe of discourse
U, and the interiors of the two circles represent the two sets A and B. Other
sets formed by combining these sets would be represented by different
regions in the diagram. For example, the shaded region in Figure 1.8 is the
region common to the interiors of the circles representing A and B, and so it
represents the set A ∩ B. Figures 1.9 and 1.10 show the regions representing
A ∪ B and A \ B, respectively.


Figure 1.7.


Figure 1.8. A ∩ B.


Figure 1.9. A ∪ B.


Figure 1.10. A \ B. 


Here’s an example of how Venn diagrams can help us understand 
operations on sets. In Example 1.4.2 the sets (A ∪ B) \ (A ∩ B) and (A \ B)
∪ (B \ A) turned out to be equal, for a particular choice of A and B. You can
see by making Venn diagrams for both sets that this was not a coincidence.
You’ll find that both Venn diagrams look like Figure 1.11. Thus, these sets
will always be equal, no matter what the sets A and B are, because both sets
will always be the set of objects that are elements of either A or B but not
both. This set is called the symmetric difference of A and B and is written A
△ B. In other words, A △ B = (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B). Later in
this section we’ll see another explanation of why these sets are always
equal.


Figure 1.11. (A ∪ B) \ (A ∩ B) = (A \ B) ∪ (B \ A).
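
A computer can test this identity on every pair of subsets of a small universe. The Python sketch below does so for the universe {1, 2, 3}, which is an arbitrary choice made for illustration; the helper name subsets is invented here. Passing such a check is evidence, not a proof, since it says nothing about sets outside the chosen universe.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of a finite collection as a frozenset."""
    items = list(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

U = {1, 2, 3}

# Compare both descriptions of the symmetric difference with Python's
# built-in symmetric difference operator (^) for all 64 pairs (A, B).
all_equal = all(
    ((A | B) - (A & B)) == ((A - B) | (B - A)) == (A ^ B)
    for A in subsets(U)
    for B in subsets(U)
)
print(all_equal)   # True
```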


Let’s return now to the question with which we began this section. If A is 
the truth set of a statement P(x) and B is the truth set of Q(x), then, as we 
saw in the last section, x ∈ A means the same thing as P(x) and x ∈ B
means the same thing as Q(x). Thus, the truth set of P(x) ∧ Q(x) is {x | P(x)
∧ Q(x)} = {x | x ∈ A ∧ x ∈ B} = A ∩ B. This should make sense. It just
says that the truth set of P(x) ∧ Q(x) consists of those elements that the
truth sets of P(x) and Q(x) have in common; in other words, the values of x
that make both P(x) and Q(x) come out true. We have already seen an
example of this. In Example 1.4.3 the sets A and B were the truth sets of the
statements “x is a man” and “x has brown hair,” and A ∩ B turned out to be
the truth set of “x is a man and x has brown hair.”


Similar reasoning shows that the truth set of P(x) ∨ Q(x) is A ∪ B. To
find the truth set of ¬P(x), we need to talk about the universe of discourse
U. The truth set of ¬P(x) will consist of those elements of the universe for
which P(x) is false, and we can find this set by starting with U and
removing from it those elements for which P(x) is true. Thus, the truth set
of ¬P(x) is U \ A.
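
These three facts about truth sets can be checked on a concrete case. In the Python sketch below, the universe and the two statements are assumptions chosen only for illustration: U is the set of integers from 1 to 20, P(x) says that x is even, and Q(x) says that x is divisible by 3.

```python
U = set(range(1, 21))

def P(x):
    return x % 2 == 0   # "x is even"

def Q(x):
    return x % 3 == 0   # "x is divisible by 3"

A = {x for x in U if P(x)}   # truth set of P(x)
B = {x for x in U if Q(x)}   # truth set of Q(x)

print({x for x in U if P(x) and Q(x)} == A & B)   # True
print({x for x in U if P(x) or Q(x)} == A | B)    # True
print({x for x in U if not P(x)} == U - A)        # True
```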

These observations about truth sets illustrate the fact that the set theory 
operations ∩, ∪, and \ are related to the logical connectives ∧, ∨, and ¬.
This shouldn’t be surprising, since after all the words and, or, and not
appear in their definitions. (The word not doesn’t appear explicitly, but it’s
there, hidden in the mathematical symbol ∉ in the definition of the
difference of two sets.) It is important to remember, though, that although
the set theory operations and logical connectives are related, they are not
interchangeable. The logical connectives can only be used to combine
statements, whereas the set theory operations must be used to combine sets.
For example, if A is the truth set of P(x) and B is the truth set of Q(x), then
we can say that A ∩ B is the truth set of P(x) ∧ Q(x), but expressions such
as A ∧ B or P(x) ∩ Q(x) are completely meaningless and should never be
used. 

The relationship between set theory operations and logical connectives 
also becomes apparent when we analyze the logical forms of statements 
about intersections, unions, and differences of sets. For example, according 
to the definition of intersection, to say that x ∈ A ∩ B means that x ∈ A ∧ x
∈ B. Similarly, to say that x ∈ A ∪ B means that x ∈ A ∨ x ∈ B, and x ∈ A
\ B means x ∈ A ∧ x ∉ B, or in other words x ∈ A ∧ ¬(x ∈ B). We can
combine these rules when analyzing statements about more complex sets. 


Example 1.4.4. Analyze the logical forms of the following statements: 


1. x ∈ A ∩ (B ∪ C).

2. x ∈ A \ (B ∩ C).

3. x ∈ (A ∩ B) ∪ (A ∩ C).


Solutions


1. x ∈ A ∩ (B ∪ C)
is equivalent to x ∈ A ∧ x ∈ (B ∪ C) (definition of ∩),
which is equivalent to x ∈ A ∧ (x ∈ B ∨ x ∈ C) (definition of ∪).

2. x ∈ A \ (B ∩ C)
is equivalent to x ∈ A ∧ ¬(x ∈ B ∩ C) (definition of \),
which is equivalent to x ∈ A ∧ ¬(x ∈ B ∧ x ∈ C) (definition of ∩).

3. x ∈ (A ∩ B) ∪ (A ∩ C)
is equivalent to x ∈ (A ∩ B) ∨ x ∈ (A ∩ C) (definition of ∪),
which is equivalent to (x ∈ A ∧ x ∈ B) ∨ (x ∈ A ∧ x ∈ C) (definition of ∩).


Look again at the solutions to parts 1 and 3 of Example 1.4.4. You should 
recognize that the statements we ended up with in these two parts are 
equivalent. (If you don’t, look back at the distributive laws in Section 1.2.) 
This equivalence means that the statements x ∈ A ∩ (B ∪ C) and x ∈ (A ∩
B) ∪ (A ∩ C) are equivalent. In other words, the objects that are elements of
the set A ∩ (B ∪ C) will be precisely the same as the objects that are
elements of (A ∩ B) ∪ (A ∩ C), no matter what the sets A, B, and C are. But
recall that sets with the same elements are equal, so it follows that for any
sets A, B, and C, A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C). Another way to see this
is with the Venn diagram in Figure 1.12. Our earlier Venn diagrams had two 
circles, because in previous examples only two sets were being combined. 
This Venn diagram has three circles, which represent the three sets A, B, 
and C that are being combined in this case. Although it is possible to create 
Venn diagrams for more than three sets, it is rarely done, because it cannot 
be done with overlapping circles. For more on Venn diagrams for more than 
three sets, see exercise 12. 


Figure 1.12. A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C).


Thus, we see that a distributive law for logical connectives has led to a 
distributive law for set theory operations. You might guess that because 
there were two distributive laws for the logical connectives, with ∧ and ∨
playing opposite roles in the two laws, there might be two distributive laws
for set theory operations too. The second distributive law for sets should say
that for any sets A, B, and C, A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C). You can
verify this for yourself by writing out the statements x ∈ A ∪ (B ∩ C) and x
∈ (A ∪ B) ∩ (A ∪ C) using logical connectives and verifying that they are
equivalent, using the second distributive law for the logical connectives ∧
and ∨. Another way to see it is to make a Venn diagram.

We can derive another set theory identity by finding a statement 
equivalent to the statement we ended up with in part 2 of Example 1.4.4: 


x ∈ A \ (B ∩ C)
is equivalent to x ∈ A ∧ ¬(x ∈ B ∧ x ∈ C) (Example 1.4.4),
which is equivalent to x ∈ A ∧ (x ∉ B ∨ x ∉ C) (De Morgan’s law),
which is equivalent to (x ∈ A ∧ x ∉ B) ∨ (x ∈ A ∧ x ∉ C) (distributive law),
which is equivalent to (x ∈ A \ B) ∨ (x ∈ A \ C) (definition of \),
which is equivalent to x ∈ (A \ B) ∪ (A \ C) (definition of ∪).


Thus, we have shown that for any sets A, B, and C, we have A \ (B ∩ C) =
(A \ B) ∪ (A \ C). Once again, you can verify this with a Venn diagram as
well.
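
The equivalence behind this identity can also be confirmed by trying every combination of truth values. In the Python sketch below, the Boolean parameters a, b, and c stand for the statements x ∈ A, x ∈ B, and x ∈ C; the function names lhs and rhs are invented for the example.

```python
from itertools import product

def lhs(a, b, c):
    """x is in A \\ (B ∩ C): x in A, and not (x in B and x in C)."""
    return a and not (b and c)

def rhs(a, b, c):
    """x is in (A \\ B) ∪ (A \\ C)."""
    return (a and not b) or (a and not c)

# Check all 8 combinations of truth values for the three statements.
equivalent = all(
    lhs(a, b, c) == rhs(a, b, c)
    for a, b, c in product([False, True], repeat=3)
)
print(equivalent)   # True
```

Because the two membership statements agree for every combination of truth values, the sets have the same elements no matter what A, B, and C are.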

Earlier we promised an alternative way to check the identity (A ∪ B) \ (A
∩ B) = (A \ B) ∪ (B \ A). You should see now how this can be done. First, we
write out the logical forms of the statements x ∈ (A ∪ B) \ (A ∩ B) and x ∈
(A \ B) ∪ (B \ A):


x ∈ (A ∪ B) \ (A ∩ B) means (x ∈ A ∨ x ∈ B) ∧ ¬(x ∈ A ∧ x ∈ B);
x ∈ (A \ B) ∪ (B \ A) means (x ∈ A ∧ x ∉ B) ∨ (x ∈ B ∧ x ∉ A).


You can now check, using equivalences from Section 1.2, that these
statements are equivalent. An alternative way to check the equivalence is
with a truth table. To simplify the truth table, let’s use P and Q as
abbreviations for the statements x ∈ A and x ∈ B. Then we must check that
the formulas (P ∨ Q) ∧ ¬(P ∧ Q) and (P ∧ ¬Q) ∨ (Q ∧ ¬P) are equivalent.
The truth table in Figure 1.13 shows this.


P   Q   (P ∨ Q) ∧ ¬(P ∧ Q)   (P ∧ ¬Q) ∨ (Q ∧ ¬P)
F   F           F                     F
F   T           T                     T
T   F           T                     T
T   T           F                     F
Figure 1.13.
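
A truth table like this one can be generated mechanically. The Python sketch below is one way to do it; the helper tf and the list rows are names invented for the example.

```python
def tf(value):
    """Render a Boolean as T or F, as in the truth tables of this chapter."""
    return "T" if value else "F"

rows = []
for p in (False, True):
    for q in (False, True):
        left = (p or q) and not (p and q)        # (P ∨ Q) ∧ ¬(P ∧ Q)
        right = (p and not q) or (q and not p)   # (P ∧ ¬Q) ∨ (Q ∧ ¬P)
        rows.append((tf(p), tf(q), tf(left), tf(right)))

for row in rows:
    print(*row)
# F F F F
# F T T T
# T F T T
# T T F F
```

The last two columns agree in every row, which is what it means for the two formulas to be equivalent.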


Definition 1.4.5. Suppose A and B are sets. We will say that A is a subset of
B if every element of A is also an element of B. We write A ⊆ B to mean
that A is a subset of B. A and B are said to be disjoint if they have no
elements in common. Note that this is the same as saying that the set of
elements they have in common is the empty set, or in other words A ∩ B =
∅.


Example 1.4.6. Suppose A = {red, green}, B = {red, yellow, green, 
purple}, and C = {blue, purple}. Then the two elements of A, red and green,
are both also in B, and therefore A ⊆ B. Also, A ∩ C = ∅, so A and C are
disjoint.
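
For readers who want to check this example by machine, Python's set type provides the comparison A <= B for the subset relation and the method isdisjoint for disjointness. The sketch below simply restates the example using strings for the colors.

```python
A = {"red", "green"}
B = {"red", "yellow", "green", "purple"}
C = {"blue", "purple"}

print(A <= B)            # True: every element of A is also in B
print(A & C == set())    # True: A and C have no elements in common
print(A.isdisjoint(C))   # True: the same fact, using the built-in method
print(B.isdisjoint(C))   # False: purple is in both B and C
```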

If we know that A ⊆ B, or that A and B are disjoint, then we might draw a
Venn diagram for A and B differently to reflect this. Figures 1.14 and 1.15
illustrate this.


Figure 1.14. A ⊆ B.


Figure 1.15. A ∩ B = ∅.


Just as we earlier derived identities showing that certain sets are always 
equal, it is also sometimes possible to show that certain sets are always 
disjoint, or that one set is always a subset of another. For example, you can 
see in a Venn diagram that the sets A ∩ B and A \ B do not overlap, and
therefore they will always be disjoint for any sets A and B. Another way to
see this would be to write out what it means to say that x ∈ (A ∩ B) ∩ (A \
B):


x ∈ (A ∩ B) ∩ (A \ B) means (x ∈ A ∧ x ∈ B) ∧ (x ∈ A ∧ x ∉ B),
which is equivalent to x ∈ A ∧ (x ∈ B ∧ x ∉ B).


But this last statement is clearly a contradiction, so the statement x ∈
(A ∩ B) ∩ (A \ B) will always be false, no matter what x is. In other words,
nothing can be an element of (A ∩ B) ∩ (A \ B), so it must be the case that
(A ∩ B) ∩ (A \ B) = ∅. Therefore, A ∩ B and A \ B are disjoint.

The next theorem gives another example of a general fact about set 
operations. The proof of this theorem illustrates that the principles of 
deductive reasoning we have been studying are actually used in 
mathematical proofs. 


Theorem 1.4.7. For any sets A and B, (A ∪ B) \ B ⊆ A.


Proof. We must show that if something is an element of (A ∪ B) \ B, then it
must also be an element of A, so suppose that x ∈ (A ∪ B) \ B. This means
that x ∈ A ∪ B and x ∉ B, or in other words x ∈ A ∨ x ∈ B and x ∉ B. But
notice that these statements have the logical form P ∨ Q and ¬Q, and this is
precisely the form of the premises of our very first example of a deductive
argument in Section 1.1! As we saw in that example, from these premises
we can conclude that x ∈ A must be true. Thus, anything that is an element
of (A ∪ B) \ B must also be an element of A, so (A ∪ B) \ B ⊆ A. □

You might think that such a careful application of logical laws is not
needed to understand why Theorem 1.4.7 is correct. The set (A ∪ B) \ B
could be thought of as the result of starting with the set A, adding in the
elements of B, and then removing them again. Common sense suggests that
the result will just be the original set A; in other words, it appears that (A ∪
B) \ B = A. However, as you are asked to show in exercise 10, this
conclusion is incorrect. This illustrates that in mathematics, you must not 
allow imprecise reasoning to lead you to jump to conclusions. Applying 
laws of logic carefully, as we did in our proof of Theorem 1.4.7, may help 
you to avoid jumping to unwarranted conclusions. 
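
The subset relation in Theorem 1.4.7 can be spot-checked by computer on a small universe. The Python sketch below tries every pair of subsets of {1, 2, 3}; the universe and the helper name subsets are assumptions made for illustration. Such a check covers only the sets it tries, so it supplements the proof rather than replacing it.

```python
from itertools import combinations

def subsets(universe):
    """Yield every subset of a finite collection as a set."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield set(combo)

U = {1, 2, 3}

# For each pair (A, B), test whether (A ∪ B) \ B is a subset of A.
holds = all(
    ((A | B) - B) <= A
    for A in subsets(U)
    for B in subsets(U)
)
print(holds)   # True
```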


Exercises 


*1. Let A = {1, 3, 12, 35}, B = {3, 7, 12, 20}, and C = {x | x is a prime
number}. List the elements of the following sets. Are any of the sets
below disjoint from any of the others? Are any of the sets below
subsets of any others?

(a) A ∩ B.
(b) (A ∪ B) \ C.
(c) A ∪ (B \ C).

2. Let A = {United States, Germany, China, Australia}, B = {Germany,
France, India, Brazil}, and C = {x | x is a country in Europe}. List the
elements of the following sets. Are any of the sets below disjoint from
any of the others? Are any of the sets below subsets of any others?

(a) A ∪ B.
(b) (A ∩ B) \ C.
(c) (B ∩ C) \ A.

3. Verify that the Venn diagrams for (A ∪ B) \ (A ∩ B) and (A \ B) ∪ (B \
A) both look like Figure 1.11, as stated in this section.

4. Use Venn diagrams to verify the following identities:

(a) A \ (A ∩ B) = A \ B.
(b) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

5. Verify the identities in exercise 4 by writing out (using logical
symbols) what it means for an object x to be an element of each set
and then using logical equivalences.

6. Use Venn diagrams to verify the following identities:

(a) (A ∪ B) \ C = (A \ C) ∪ (B \ C).
(b) A ∪ (B \ C) = (A ∪ B) \ (C \ A).

7. Verify the identities in exercise 6 by writing out (using logical
symbols) what it means for an object x to be an element of each set
and then using logical equivalences.

8. Use any method you wish to verify the following identities:

(a) (A \ B) ∩ C = (A ∩ C) \ B.
(b) (A ∩ B) \ B = ∅.
(c) A \ (A \ B) = A ∩ B.

9. For each of the following sets, write out (using logical symbols) what
it means for an object x to be an element of the set. Then determine
which of these sets must be equal to each other by determining which
statements are equivalent.

(a) (A \ B) \ C.
(b) A \ (B \ C).
(c) (A \ B) ∪ (A ∩ C).
(d) (A \ B) ∩ (A \ C).
(e) A \ (B ∪ C).

10. It was shown in this section that for any sets A and B, (A ∪ B) \ B ⊆
A.

(a) Give an example of two sets A and B for which (A ∪ B) \ B ≠ A.
(b) Show that for all sets A and B, (A ∪ B) \ B = A \ B.

11. Suppose A and B are sets. Is it necessarily true that (A \ B) ∪ B = A? If
not, is one of these sets necessarily a subset of the other? Is (A \ B) ∪
B always equal to either A \ B or A ∪ B?

*12. It is claimed in this section that you cannot make a Venn diagram for
four sets using overlapping circles.

(a) What’s wrong with the following diagram? (Hint: Where’s the set (A
∩ D) \ (B ∪ C)?)
(b) Can you make a Venn diagram for four sets using shapes other than
circles?

13. (a) Make Venn diagrams for the sets (A ∪ B) \ C and A ∪ (B \ C).
What can you conclude about whether one of these sets is
necessarily a subset of the other?
(b) Give an example of sets A, B, and C for which (A ∪ B) \ C ≠ A ∪ (B \
C).

*14. Use Venn diagrams to show that the associative law holds for
symmetric difference; that is, for any sets A, B, and C, A △ (B △ C) = (A
△ B) △ C.

15. Use any method you wish to verify the following identities:

(a) (A △ B) ∪ C = (A ∪ C) △ (B \ C).
(b) (A △ B) ∩ C = (A ∩ C) △ (B ∩ C).
(c) (A △ B) \ C = (A \ C) △ (B \ C).

16. Use any method you wish to verify the following identities:

(a) (A ∪ B) △ C = (A △ C) △ (B \ A).
(b) (A ∩ B) △ C = (A △ C) △ (A \ B).
(c) (A \ B) △ C = (A △ C) △ (A ∩ B).

17. Fill in the blanks to make true identities:

(a) (A △ B) ∩ C = (C \ A) △ ____.
(b) C \ (A △ B) = (A ∩ C) △ ____.
(c) (B \ A) △ C = (A △ C) △ ____.

1.5 The Conditional and Biconditional Connectives


It is time now to return to a question we left unanswered in Section 1.1. We 
have seen how the reasoning in the first and third arguments in Example 
1.1.1 can be understood by analyzing the connectives ∨ and ¬. But what
about the reasoning in the second argument? Recall that the argument went 
like this: 


If today is Sunday, then I don’t have to go to work today. 


Today is Sunday. 
Therefore, I don’t have to go to work today. 


What makes this reasoning valid? 

It appears that the crucial words here are if and then, which occur in the
first premise. We therefore introduce a new logical connective, →, and
write P → Q to represent the statement “If P then Q.” This statement is
sometimes called a conditional statement, with P as its antecedent and Q as
its consequent. If we let P stand for the statement “Today is Sunday” and Q
for the statement “I don’t have to go to work today,” then the logical form
of the argument would be


P → Q

P

∴ Q

Our analysis of the new connective → should lead to the conclusion that
this argument is valid.


Example 1.5.1. Analyze the logical forms of the following statements: 


1. If it’s raining and I don’t have my umbrella, then I’ll get wet.


2. If Mary did her homework, then the teacher won’t collect it, and if she 
didn’t, then he’ll ask her to do it on the board. 


Solutions 


1. Let R stand for the statement “It’s raining,” U for “I have my
umbrella,” and W for “I’ll get wet.” Then statement 1 would be
represented by the formula (R ∧ ¬U) → W.


2. Let H stand for “Mary did her homework,” C for “The teacher will
collect it,” and B for “The teacher will ask Mary to do the homework
on the board.” Then the given statement means (H → ¬C) ∧ (¬H →
B).


To analyze arguments containing the connective → we must work out the
truth table for the formula P → Q. Because P → Q is supposed to mean that
if P is true then Q is also true, we certainly want to say that if P is true and
Q is false then P → Q is false. If P is true and Q is also true, then it seems
reasonable to say that P → Q is true. This gives us the last two lines of the
truth table in Figure 1.16. The remaining two lines of the truth table are
harder to fill in, although probably most people would say that if P and Q
are both false then P → Q should be considered true. Thus, we can sum up
our conclusions so far with the table in Figure 1.16.


P   Q   P → Q
F   F     T
F   T     ?
T   F     F
T   T     T
Figure 1.16. 


To help us fill in the undetermined lines in this truth table, let’s look at an
example. Consider the statement “If x > 2 then x² > 4,” which we could
represent with the formula P(x) → Q(x), where P(x) stands for the statement
x > 2 and Q(x) stands for x² > 4. Of course, the statements P(x) and Q(x)
contain x as a free variable, and each will be true for some values of x and
false for others. But surely, no matter what the value of x is, we would say it
is true that if x > 2 then x² > 4, so the conditional statement P(x) → Q(x)
should be true. Thus, the truth table should be completed in such a way that
no matter what value we plug in for x, this conditional statement comes out
true.


For example, suppose x = 3. In this case x > 2 and x² = 9 > 4, so P(x) and
Q(x) are both true. This corresponds to line four of the truth table in Figure
1.16, and we’ve already decided that the statement P(x) → Q(x) should
come out true in this case. But now consider the case x = 1. Then x < 2 and
x² = 1 < 4, so P(x) and Q(x) are both false, corresponding to line one in the
truth table. We have tentatively placed a T in this line of the truth table, and
now we see that this tentative choice must be right. If we put an F there,
then the statement P(x) → Q(x) would come out false in the case x = 1, and
we’ve already decided that it should be true for all values of x.


Finally, consider the case x = -5. Then x < 2, so P(x) is false, but x² = 25
> 4, so Q(x) is true. Thus, in this case we find ourselves in the second line
of the truth table, and once again, if the conditional statement P(x) → Q(x)
is to be true in this case, we must put a T in this line. So it appears that all
the questionable lines in the truth table in Figure 1.16 must be filled in with
T’s, and the completed truth table for the connective → must be as shown in
Figure 1.17.


P Q P → Q

F F T 

F T T 

T F F 

T T T 
Figure 1.17. 


Of course, there are many other values of x that could be plugged into our
statement “If x > 2 then x² > 4”; but if you try them, you’ll find that they all
lead to line one, two, or four of the truth table, as our examples x = 1, -5,
and 3 did. No value of x will lead to line three, because you could never
have x > 2 but x² ≤ 4. After all, that’s why we said that the statement “If x >
2 then x² > 4” was always true, no matter what x was! The point of saying
that this conditional statement is always true is simply to say that you will
never find a value of x such that x > 2 and x² ≤ 4; in other words, there is
no value of x for which P(x) is true but Q(x) is false. Thus, it should make
sense that in the truth table for P → Q, the only line that is false is the line
in which P is true and Q is false.
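
The case analysis above can be replayed mechanically. The following Python sketch is only an illustration: the helper name row_of and the list of sample values are assumptions introduced here, not part of the text. It reports which line of the truth table each sample value of x lands on. Checking a finite sample does not prove the claim for every real number; it only shows the pattern described above.

```python
# Line numbers follow Figure 1.17:
# 1 = (P false, Q false), 2 = (P false, Q true),
# 3 = (P true, Q false),  4 = (P true, Q true).
def row_of(x):
    """Return the truth-table line for P(x): x > 2 and Q(x): x**2 > 4."""
    p = x > 2
    q = x ** 2 > 4
    return 1 + 2 * int(p) + int(q)

# Hypothetical sample values, including the three used in the text.
samples = [1, -5, 3, 0, 2, -2, 2.5, 100, -0.5]
rows = {x: row_of(x) for x in samples}

for x, row in rows.items():
    print(f"x = {x}: line {row}")

# Check whether any sample lands on line three (P true, Q false).
print("line three reached:", 3 in rows.values())
```

For the values x = 1, -5, and 3 the sketch reports lines one, two, and four, matching the discussion above.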

As the truth table in Figure 1.18 shows, the formula ¬P ∨ Q is also true
in every case except when P is true and Q is false. Thus, if we accept the
truth table in Figure 1.17 as the correct truth table for the formula P → Q,
then we will be forced to accept the conclusion that the formulas P → Q
and ¬P ∨ Q are equivalent. Is this consistent with the way the words if and
then are used in ordinary language? It may not seem to be at first, but, at
least for some uses of the words if and then, it is.


P   Q   ¬P ∨ Q
F   F     T
F   T     T
T   F     F
T   T     T
Figure 1.18. 


For example, imagine a teacher saying to a class, in a threatening tone of
voice, “You won’t neglect your homework, or you’ll fail the course.”
Grammatically, this statement has the form ¬P ∨ Q, where P is the
statement “You will neglect your homework” and Q is “You’ll fail the
course.” But what message is the teacher trying to convey with this
statement? Clearly the intended message is “If you neglect your homework,
then you’ll fail the course,” or in other words P → Q. Thus, in this example,
the formulas ¬P ∨ Q and P → Q seem to mean the same thing.

There is a similar idea at work in the first statement from Example 1.1.2,
“Either John went to the store, or we’re out of eggs.” In Section 1.1 we
represented this statement by the formula P ∨ Q, with P standing for “John
went to the store” and Q for “We’re out of eggs.” But someone who made
this statement would probably be trying to express the idea that if John
didn’t go to the store, then we’re out of eggs, or in other words ¬P → Q.
Thus, this example suggests that ¬P → Q means the same thing as P ∨ Q.
In fact, we can derive this equivalence from the previous one by
substituting ¬P for P. Because P → Q is equivalent to ¬P ∨ Q, it follows
that ¬P → Q is equivalent to ¬¬P ∨ Q, which is equivalent to P ∨ Q by the
double negation law.

We can derive another useful equivalence as follows: 


¬P ∨ Q is equivalent to ¬P ∨ ¬¬Q (double negation law),


which is equivalent to ¬(P ∧ ¬Q) (De Morgan’s law).


Thus, P → Q is also equivalent to ¬(P ∧ ¬Q). In fact, this is precisely the
conclusion we reached earlier when discussing the statement “If x > 2 then
x² > 4.” We decided then that the reason this statement is true for every
value of x is that there is no value of x for which x > 2 and x² ≤ 4. In other
words, the statement P(x) ∧ ¬Q(x) is never true, where as before P(x)
stands for x > 2 and Q(x) for x² > 4. But that’s the same as saying that the
statement ¬(P(x) ∧ ¬Q(x)) is always true. Thus, to say that P(x) → Q(x) is
always true means the same thing as saying that ¬(P(x) ∧ ¬Q(x)) is always
true.

For another example of this equivalence, consider the statement “If it’s
going to rain, then I’ll take my umbrella.” Of course, this statement has the
form P → Q, where P stands for the statement “It’s going to rain” and Q
stands for “I’ll take my umbrella.” But we could also think of this statement
as a declaration that I won’t be caught in the rain without my umbrella; in
other words, ¬(P ∧ ¬Q).


To summarize, so far we have discovered the following equivalences 
involving conditional statements: 


Conditional laws 


P → Q is equivalent to ¬P ∨ Q.
P → Q is equivalent to ¬(P ∧ ¬Q).
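
Both conditional laws can be checked by brute force over the four lines of the truth table. The sketch below is one way to do that in Python; the names IMPLIES, conditional, and equivalent are assumptions chosen for this illustration. The table for the conditional is copied from Figure 1.17 rather than computed, so the comparison is not circular.

```python
from itertools import product

# Truth table for the conditional, copied line by line from Figure 1.17.
IMPLIES = {
    (False, False): True,
    (False, True): True,
    (True, False): False,
    (True, True): True,
}

def conditional(p, q):
    return IMPLIES[(p, q)]

def equivalent(f, g):
    """True if f and g agree on every assignment of truth values to P and Q."""
    return all(f(p, q) == g(p, q) for p, q in product([False, True], repeat=2))

# First conditional law: compare with (not P) or Q.
law_1 = equivalent(conditional, lambda p, q: (not p) or q)
# Second conditional law: compare with not (P and not Q).
law_2 = equivalent(conditional, lambda p, q: not (p and not q))
print(law_1, law_2)
```

Both comparisons come out true, in agreement with the laws stated above.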


In case you’re still not convinced that the truth table in Figure 1.17 is
right, we give one more reason. We know that, using this truth table, we can 
now analyze the validity of deductive arguments involving the words if and 
then. We’ll find, when we analyze a few simple arguments, that the truth 
table in Figure 1.17 leads to reasonable conclusions about the validity of 
these arguments. But if we were to make any changes in the truth table, we 
would end up with conclusions that are clearly incorrect. For example, let’s 
return to the argument form with which we started this section: 

P → Q

P

∴ Q

We have already decided that this form of argument should be valid, and the
truth table in Figure 1.19 confirms this. The premises are both true only in
line four of the table, and in this line the conclusion is true as well.


            Premises      Conclusion
P   Q    P → Q     P          Q
F   F      T       F          F
F   T      T       F          T
T   F      F       T          F
T   T      T       T          T
Figure 1.19. 


You can also see from Figure 1.19 that both premises are needed to make
this argument valid. But if we were to change the truth table for the
conditional statement to make P → Q false in the first line of the table, then
the second premise of this argument would no longer be needed. We would
end up with the conclusion that, just from the single premise P → Q, we
could infer that Q must be true, since in the two lines of the truth table in
which the premise P → Q would still be true, lines two and four, the
conclusion Q is true too. But this doesn’t seem right. Just knowing that if P
is true then Q is true, but not knowing that P is true, it doesn’t seem
reasonable that we should be able to conclude that Q is true. For example,
suppose we know that the statement “If John didn’t go to the store then
we’re out of eggs” is true. Unless we also know whether or not John has
gone to the store, we can’t reach any conclusion about whether or not we’re
out of eggs. Thus, changing the first line of the truth table for P → Q would
lead to an incorrect conclusion about the validity of an argument.

Changing the second line of the truth table would also lead to 
unacceptable conclusions about the validity of arguments. To see this, 
consider the argument form: 


P → Q
Q
∴ P


This should not be considered a valid form of reasoning. For example, 
consider the following argument, which has this form: 


If Jones was convicted of murdering Smith, then he will go to jail. 
Jones will go to jail. 
Therefore, Jones was convicted of murdering Smith. 


Even if the premises of this argument are true, the conclusion that Jones 
was convicted of murdering Smith doesn’t follow. Maybe the reason he will 
go to jail is that he robbed a bank or cheated on his income tax. Thus, the 
conclusion of this argument could be false even if the premises were true, 
so the argument isn’t valid. 

The truth table analysis in Figure 1.20 agrees with this conclusion. In line
two of the table, the conclusion P is false, but both premises are true, so the
argument is invalid. But notice that if we were to change the truth table for
P → Q and make it false in line two, then the truth table analysis would say
that the argument is valid. Thus, the analysis of this argument seems to
support our decision to put a T in the second line of the truth table for P →
Q.


            Premises      Conclusion
P   Q    P → Q     Q          P
F   F      T       F          F
F   T      T       T          F
T   F      F       F          T
T   T      T       T          T
Figure 1.20. 
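
The truth-table test used in Figures 1.19 and 1.20 can be written as a short program. The Python sketch below is an illustration under assumed names (implies, altered, is_valid); it treats an argument form as valid when no line of the table makes every premise true and the conclusion false. It also tries the hypothetical change to line two of the conditional's table that the text warns against.

```python
from itertools import product

def implies(p, q):
    # Figure 1.17: the conditional is false only when P is true and Q is false.
    return (not p) or q

def altered(p, q):
    # Hypothetical table with line two (P false, Q true) changed to F.
    return False if (not p and q) else implies(p, q)

def is_valid(premises, conclusion):
    """Valid means: no line has all premises true and the conclusion false."""
    for p, q in product([False, True], repeat=2):
        if all(prem(p, q) for prem in premises) and not conclusion(p, q):
            return False
    return True

# "If P then Q", P, therefore Q (Figure 1.19).
first_form = is_valid([implies, lambda p, q: p], lambda p, q: q)
# "If P then Q", Q, therefore P (Figure 1.20).
second_form = is_valid([implies, lambda p, q: q], lambda p, q: p)
# The second form again, using the altered table.
second_form_altered = is_valid([altered, lambda p, q: q], lambda p, q: p)
print(first_form, second_form, second_form_altered)
```

With the table of Figure 1.17 the first form is valid and the second is not; with the altered table the second form would wrongly be judged valid, which is the point made above.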


The last example shows that from the premises P → Q and Q it is
incorrect to infer P. But it would certainly be correct to infer P from the
premises Q → P and Q. This shows that the formulas P → Q and Q → P do
not mean the same thing. You can check this by making a truth table for
both and verifying that they are not equivalent. For example, a person might
believe that, in general, the statement “If you are a convicted murderer then
you are untrustworthy” is true, without believing that the statement “If you
are untrustworthy then you are a convicted murderer” is generally true. The
formula Q → P is called the converse of P → Q. It is very important to
make sure you never confuse a conditional statement with its converse.

The contrapositive of P → Q is the formula ¬Q → ¬P, and it is
equivalent to P → Q. This may not be obvious at first, but you can verify it
with a truth table. For example, the statements “If John cashed the check I
wrote then my bank account is overdrawn” and “If my bank account isn’t
overdrawn then John hasn’t cashed the check I wrote” are equivalent. I
would be inclined to assert both in exactly the same circumstances,
namely, if the check I wrote was for more money than I had in my account.
The equivalence of conditional statements and their contrapositives is used
often in mathematical reasoning. We add it to our list of important
equivalences:


Contrapositive law 
P → Q is equivalent to ¬Q → ¬P.
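
The truth-table check suggested above for the contrapositive and the converse might look like the following Python sketch. The helper implies and the variable names are assumptions made for illustration only.

```python
from itertools import product

def implies(p, q):
    # Figure 1.17: false only when p is true and q is false.
    return (not p) or q

rows = list(product([False, True], repeat=2))

# The contrapositive (not Q) -> (not P) agrees with P -> Q on every line.
contrapositive_agrees = all(
    implies(p, q) == implies(not q, not p) for p, q in rows
)

# The converse Q -> P disagrees with P -> Q on some lines; collect them.
converse_differs = [(p, q) for p, q in rows if implies(p, q) != implies(q, p)]

print(contrapositive_agrees)
print(converse_differs)
```

The list of disagreements contains exactly the two lines in which P and Q have different truth values.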
Example 1.5.2. Which of the following statements are equivalent? 


1. If it’s either raining or snowing, then the game has been canceled. 


2. If the game hasn’t been canceled, then it’s not raining and it’s not 
snowing. 


3. If the game has been canceled, then it’s either raining or snowing.
4. If it’s raining then the game has been canceled, and if it’s snowing 
then the game has been canceled. 
5. If it’s neither raining nor snowing, then the game hasn’t been 
canceled. 
Solution 


We translate all of the statements into the notation of logic, using the 
following abbreviations: R stands for the statement “It’s raining,” S stands 
for “It’s snowing,” and C stands for “The game has been canceled.” 


1. (R ∨ S) → C.


2. ¬C → (¬R ∧ ¬S). By one of De Morgan’s laws, this is equivalent to
¬C → ¬(R ∨ S). This is the contrapositive of statement 1, so they are
equivalent.


3. C → (R ∨ S). This is the converse of statement 1, which is not
equivalent to it. You can verify this with a truth table, or just think
about what the statements mean. Statement 1 says that rain or snow
would result in cancelation of the game. Statement 3 says that these
are the only circumstances in which the game will be canceled.


4. (R → C) ∧ (S → C). This is also equivalent to statement 1, as the
following reasoning shows:

(R → C) ∧ (S → C)
is equivalent to (¬R ∨ C) ∧ (¬S ∨ C) (conditional law),
which is equivalent to (¬R ∧ ¬S) ∨ C (distributive law),
which is equivalent to ¬(R ∨ S) ∨ C (De Morgan’s law),
which is equivalent to (R ∨ S) → C (conditional law).

You should read statements 1 and 4 again and see if it makes sense to
you that they’re equivalent.


5. ¬(R ∨ S) → ¬C. This is the contrapositive of statement 3, so they are
equivalent. It is not equivalent to statements 1, 2, and 4.
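
The grouping found in this example can be double-checked with a truth table over R, S, and C. The Python sketch below is an illustration under assumed names (implies, statements, classes); it encodes the five formulas from the solution and groups the statement numbers whose truth-value columns coincide.

```python
from itertools import product

def implies(p, q):
    # Figure 1.17: false only when p is true and q is false.
    return (not p) or q

# Statements 1-5 of Example 1.5.2 as functions of R, S, C.
statements = {
    1: lambda r, s, c: implies(r or s, c),
    2: lambda r, s, c: implies(not c, (not r) and (not s)),
    3: lambda r, s, c: implies(c, r or s),
    4: lambda r, s, c: implies(r, c) and implies(s, c),
    5: lambda r, s, c: implies(not (r or s), not c),
}

rows = list(product([False, True], repeat=3))

# Two statements are equivalent exactly when their columns are identical.
groups = {}
for number, formula in statements.items():
    column = tuple(formula(r, s, c) for r, s, c in rows)
    groups.setdefault(column, []).append(number)

classes = sorted(groups.values())
print(classes)
```

The result has two classes, one containing statements 1, 2, and 4 and the other containing statements 3 and 5, as the solution concludes.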


Statements that mean P → Q come up very often in mathematics, but
sometimes they are not written in the form “If P then Q.” Here are a few
other ways of expressing the idea P → Q that are used often in
mathematics:


P implies Q. 

Q, if P. 

P only if Q. 

P is a sufficient condition for Q. 
Q is a necessary condition for P. 


Some of these may require further explanation. The second expression,
“Q, if P,” is just a slight rearrangement of the statement “If P then Q,” so it
should make sense that it means P → Q. As an example of a statement of
the form “P only if Q,” consider the sentence “You can run for president
only if you are a citizen.” In this case, P is “You can run for president” and
Q is “You are a citizen.” What the statement means is that if you’re not a
citizen, then you can’t run for president, or in other words ¬Q → ¬P. But
by the contrapositive law, this is equivalent to P → Q.

Think of “P is a sufficient condition for Q” as meaning “The truth of P
suffices to guarantee the truth of Q,” and it should make sense that this
should be represented by P → Q. Finally, “Q is a necessary condition for P”
means that in order for P to be true, it is necessary for Q to be true also.
This means that if Q isn’t true, then P can’t be true either, or in other words,
¬Q → ¬P. Once again, by the contrapositive law we get P → Q.


Example 1.5.3. Analyze the logical forms of the following statements: 


1. If at least ten people are there, then the lecture will be given. 
2. The lecture will be given only if at least ten people are there. 
3. The lecture will be given if at least ten people are there. 
4. Having at least ten people there is a sufficient condition for the lecture
being given.


5. Having at least ten people there is a necessary condition for the 
lecture being given. 


Solutions 


Let T stand for the statement “At least ten people are there” and L for “The 
lecture will be given.” 


1. T → L.


2. L → T. The given statement means that if there are not at least ten
people there, then the lecture will not be given, or in other words ¬T
→ ¬L. By the contrapositive law, this is equivalent to L → T.


3. T → L. This is just a rephrasing of statement 1.


4. T → L. The statement says that having at least ten people there
suffices to guarantee that the lecture will be given, and this means that
if there are at least ten people there, then the lecture will be given.


5. L → T. This statement means the same thing as statement 2: If there
are not at least ten people there, then the lecture will not be given.


We have already seen that a conditional statement P → Q and its
converse Q → P are not equivalent. Often in mathematics we want to say
that both P → Q and Q → P are true, and it is therefore convenient to
introduce a new connective symbol, ↔, to express this. You can think of P
↔ Q as just an abbreviation for the formula (P → Q) ∧ (Q → P). A
statement of the form P ↔ Q is called a biconditional statement, because it
represents two conditional statements. By making a truth table for (P → Q)
∧ (Q → P) you can verify that the truth table for P ↔ Q is as shown in
Figure 1.21. Note that, by the contrapositive law, P ↔ Q is also equivalent
to (P → Q) ∧ (¬P → ¬Q).


P Q P ↔ Q

F F T 

F T F 

T F F 

T T T 
Figure 1.21. 
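
The truth table in Figure 1.21 can be recomputed from the definition of the biconditional. The Python sketch below is illustrative only, and its names (implies, figure_1_21, and so on) are assumptions. It builds the column for "P implies Q, and Q implies P" and for the contrapositive variant "P implies Q, and not-P implies not-Q", then compares each with the column in the figure.

```python
from itertools import product

def implies(p, q):
    # Figure 1.17: false only when p is true and q is false.
    return (not p) or q

# Biconditional column from Figure 1.21, for the lines FF, FT, TF, TT.
figure_1_21 = [True, False, False, True]

rows = list(product([False, True], repeat=2))

both_directions = [implies(p, q) and implies(q, p) for p, q in rows]
with_contrapositive = [implies(p, q) and implies(not p, not q) for p, q in rows]

print(both_directions == figure_1_21)
print(with_contrapositive == figure_1_21)
```

Both columns match the figure, so either formula can serve as the expansion of the biconditional.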


Because Q → P can be written “P if Q” and P → Q can be written “P
only if Q,” P ↔ Q means “P if Q and P only if Q,” and this is often written
“P if and only if Q.” The phrase if and only if occurs so often in
mathematics that there is a common abbreviation for it, iff. Thus, P ↔ Q is
often written “P iff Q.” Another statement that means P ↔ Q is “P is a
necessary and sufficient condition for Q.”


Example 1.5.4. Analyze the logical forms of the following statements: 


1. The game will be canceled iff it’s either raining or snowing. 


2. Having at least ten people there is a necessary and sufficient condition 
for the lecture being given. 


3. If John went to the store then we have some eggs, and if he didn’t 
then we don’t. 


Solutions 


1. Let C stand for “The game will be canceled,” R for “It’s raining,” and
S for “It’s snowing.” Then the statement would be represented by the
formula C ↔ (R ∨ S).


2. Let T stand for “There are at least ten people there” and L for “The
lecture will be given.” Then the statement means T ↔ L.


3. Let S stand for “John went to the store” and E for “We have some
eggs.” Then a literal translation of the given statement would be (S →
E) ∧ (¬S → ¬E). This is equivalent to S ↔ E.


One of the reasons it’s so easy to confuse a conditional statement with its 
converse is that in everyday speech we sometimes use a conditional 
statement when what we mean to convey is actually a biconditional. For 
example, you probably wouldn’t say “The lecture will be given if at least 
ten people are there” unless it was also the case that if there were fewer than 
ten people, the lecture wouldn’t be given. After all, why mention the 
number ten at all if it’s not the minimum number of people required? Thus, 
the statement actually suggests that the lecture will be given iff there are at 
least ten people there. For another example, suppose a child is told by his 
parents, “If you don’t eat your dinner, you won’t get any dessert.” The child 
certainly expects that if he does eat his dinner, he will get dessert, although 
that’s not literally what his parents said. In other words, the child interprets
the statement as meaning “Eating your dinner is a necessary and sufficient
condition for getting dessert.” 

Such a blurring of the distinction between if and iff is never acceptable in 
mathematics. Mathematicians always use a phrase such as iff or necessary 
and sufficient condition when they want to express a biconditional
statement. You should never interpret an if-then statement in mathematics 
as a biconditional statement, the way you might in everyday speech. 


Exercises 


*1. Analyze the logical forms of the following statements:

(a) If this gas either has an unpleasant smell or is not explosive, then it
isn’t hydrogen.

(b) Having both a fever and a headache is a sufficient condition for
George to go to the doctor.

(c) Both having a fever and having a headache are sufficient conditions
for George to go to the doctor.

(d) If x ≠ 2, then a necessary condition for x to be prime is that x be odd.

2. Analyze the logical forms of the following statements:

(a) Mary will sell her house only if she can get a good price and find a
nice apartment.

(b) Having both a good credit history and an adequate down payment is a
necessary condition for getting a mortgage.

(c) John will drop out of school, unless someone stops him. (Hint: First
try to rephrase this using the words if and then instead of unless.)

(d) If x is divisible by either 4 or 6, then it isn’t prime.

3. Analyze the logical form of the following statement:

(a) If it is raining, then it is windy and the sun is not shining.

Now analyze the following statements. Also, for each statement determine
whether the statement is equivalent to either statement (a) or its
converse.

(b) It is windy and not sunny only if it is raining.

(c) Rain is a sufficient condition for wind with no sunshine.

(d) Rain is a necessary condition for wind with no sunshine.

(e) It’s not raining, if either the sun is shining or it’s not windy.

(f) Wind is a necessary condition for it to be rainy, and so is a lack of
sunshine.

(g) Either it is windy only if it is raining, or it is not sunny only if it is
raining.

*4. Use truth tables to determine whether or not the following arguments
are valid:

(a) Either sales or expenses will go up. If sales go up, then the boss will
be happy. If expenses go up, then the boss will be unhappy.
Therefore, sales and expenses will not both go up.

(b) If the tax rate and the unemployment rate both go up, then there will
be a recession. If the GDP goes up, then there will not be a recession.
The GDP and taxes are both going up. Therefore, the unemployment
rate is not going up.

(c) The warning light will come on if and only if the pressure is too high
and the relief valve is clogged. The relief valve is not clogged.
Therefore, the warning light will come on if and only if the pressure
is too high.

5. Use truth tables to determine whether or not the following arguments
are valid:

(a) If Jones is convicted then he will go to prison. Jones will be convicted
only if Smith testifies against him. Therefore, Jones won’t go to
prison unless Smith testifies against him.

(b) Either the Democrats or the Republicans will have a majority in the
Senate, but not both. Having a Democratic majority is a necessary
condition for the bill to pass. Therefore, if the Republicans have a
majority in the Senate then the bill won’t pass.

6. (a) Show that P ↔ Q is equivalent to (P ∧ Q) ∨ (¬P ∧ ¬Q).

(b) Show that (P → Q) ∨ (P → R) is equivalent to P → (Q ∨ R).

*7. (a) Show that (P → R) ∧ (Q → R) is equivalent to (P ∨ Q) → R.

(b) Formulate and verify a similar equivalence involving (P → R) ∨ (Q
→ R).

8. (a) Show that (P → Q) ∧ (Q → R) is equivalent to (P → R) ∧ [(P ↔
Q) ∨ (R ↔ Q)].

(b) Show that (P → Q) ∨ (Q → R) is a tautology.

*9. Find a formula involving only the connectives ¬ and → that is
equivalent to P ∧ Q.

10. Find a formula involving only the connectives ¬ and → that is
equivalent to P ↔ Q.

11. (a) Show that (P ∨ Q) ↔ Q is equivalent to P → Q.

(b) Show that (P ∧ Q) ↔ Q is equivalent to Q → P.

12. Which of the following formulas are equivalent?

(a) P → (Q → R).

(b) Q → (P → R).

(c) (P → Q) ∧ (P → R).

(d) (P ∧ Q) → R.

(e) P → (Q ∧ R).

2 


Quantificational Logic 


2.1 Quantifiers 


We have seen that a statement P(x) containing a free variable x may be true 
for some values of x and false for others. Sometimes we want to say 
something about how many values of x make P(x) come out true. In 
particular, we often want to say either that P(x) is true for every value of x 
or that it is true for at least one value of x. We therefore introduce two more 
symbols, called quantifiers, to help us express these ideas. 

To say that P(x) is true for every value of x in the universe of discourse
U, we will write ∀xP(x). This is read “For all x, P(x).” Think of the upside
down A as standing for the word all. The symbol ∀ is called the universal
quantifier, because the statement ∀xP(x) says that P(x) is universally true.
As we discussed in Section 1.3, to say that P(x) is true for every value of x
in the universe means that the truth set of P(x) will be the whole universe U.
Thus, you could also think of the statement ∀xP(x) as saying that the truth
set of P(x) is equal to U.


We write ∃xP(x) to say that there is at least one value of x in the universe
for which P(x) is true. This is read “There exists an x such that P(x).” The
backward E comes from the word exists and is called the existential
quantifier. Once again, you can interpret this statement as saying something
about the truth set of P(x). To say that P(x) is true for at least one value of x
means that there is at least one element in the truth set of P(x), or in other
words, the truth set is not equal to ∅.

For example, in Section 1.5 we discussed the statement “If x > 2 then x²
> 4,” where x ranges over the set of all real numbers, and we claimed that
this statement was true for all values of x. We can now write this claim
symbolically as ∀x(x > 2 → x² > 4).
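
Over a finite universe of discourse, the two quantifiers behave like Python's built-in all and any. The sketch below is an illustration only: the list named universe is a hypothetical finite sample standing in for the real numbers, so a result of True for the universal statement is evidence about the sample, not a proof about every real number.

```python
# Hypothetical finite sample standing in for the universe of discourse.
universe = [-5, -2, -1, 0, 0.5, 1, 2, 2.5, 3, 10]

def implies(p, q):
    # Truth table of the conditional from Section 1.5.
    return (not p) or q

# "For all x, if x > 2 then x^2 > 4", restricted to the sample.
for_all = all(implies(x > 2, x ** 2 > 4) for x in universe)

# "There exists an x such that x > 2", restricted to the sample.
there_exists = any(x > 2 for x in universe)

# "For all x, x^2 > 4" fails on the sample; these values lie outside its truth set.
counterexamples = [x for x in universe if not x ** 2 > 4]

print(for_all, there_exists)
print(counterexamples)
```

The last list shows why the unconditional claim "x² > 4 for all x" is false while the conditional claim holds: every sample value with x² ≤ 4 also fails x > 2.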


Example 2.1.1. What do the following formulas mean? Are they true or
false?

1. ∀x(x² ≥ 0), where the universe of discourse is R, the set of all real
numbers.

2. ∃x(x² - 2x + 3 = 0), with universe R again.

3. ∃x(M(x) ∧ B(x)), where the universe of discourse is the set of all
people, M(x) stands for the statement “x is a man,” and B(x) means “x
has brown hair.”

4. ∀x(M(x) → B(x)), with the same universe and the same meanings for
M(x) and B(x).

5. ∀xL(x, y), where the universe is the set of all people, and L(x, y)
means “x likes y.”

Solutions 

1. This means that for every real number x, x² ≥ 0. This is true.

2. This means that there is at least one real number x that makes the
equation x² - 2x + 3 = 0 come out true. In other words, the equation
has at least one real solution. If you solve the equation, you’ll find
that this statement is false; the equation has no real solutions. (Try
either completing the square or using the quadratic formula.)

3. There is at least one person x such that x is a man and x has brown 
hair. In other words, there is at least one man who has brown hair. Of 
course, this is true. 

4. For every person x, if x is a man then x has brown hair. In other
words, all men have brown hair. If you’re not convinced that this is
what the formula means, it might help to look back at the truth table
for the conditional connective. According to this truth table, the
statement M(x) → B(x) will be false only if M(x) is true and B(x) is
false; that is, x is a man and x doesn’t have brown hair. Thus, to say
that M(x) → B(x) is true for every person x means that this situation
never occurs, or in other words, that there are no men who don’t have
brown hair. But that’s exactly what it means to say that all men have
brown hair. Of course, this statement is false.


5. For every person x, x likes y. In other words, everyone likes y. We 
can’t tell if this is true or false unless we know who y is. 


Notice that in the fifth statement in this example, we needed to know 
who y was to determine if the statement was true or false, but not who x 
was. The statement says that everyone likes y, and this is a statement about 
y, but not x. This means that y is a free variable in this statement but x is a 
bound variable. 

Similarly, although all the other statements contain the letter x, we didn’t
need to know the value of x to determine their truth values, so x is a bound
variable in every case. In general, even if x is a free variable in some
statement P(x), it is a bound variable in the statements ∀xP(x) and ∃xP(x).
For this reason, we say that the quantifiers bind a variable. As in Section
1.3, this means that a variable that is bound by a quantifier can always be
replaced with a new variable without changing the meaning of the
statement, and it is often possible to paraphrase the statement without
mentioning the bound variable at all. For example, the statement ∀xL(x, y)
from Example 2.1.1 is equivalent to ∀wL(w, y), because both mean the
same thing as “Everyone likes y.” Words such as everyone, someone,
everything, or something are often used to express the meanings of
statements containing quantifiers. If you are translating an English
statement into symbols, these words will often tip you off that a quantifier
will be needed.


As with the symbol ¬, we follow the convention that the expressions ∀x
and ∃x apply only to the statements that come immediately after them. For
example, ∀xP(x) → Q(x) means (∀xP(x)) → Q(x), not ∀x(P(x) → Q(x)).


Example 2.1.2. Analyze the logical forms of the following statements. 


1. Someone didn’t do the homework.

2. Everything in that store is either overpriced or poorly made.

3. Nobody’s perfect.

4. Susan likes everyone who dislikes Joe.

5. A ⊆ B.

6. A ∩ B ⊆ B \ C.


Solutions 


1. The word someone tips us off that we should use an existential
quantifier. As a first step, we write ∃x(x didn’t do the homework).
Now if we let H(x) stand for the statement “x did the homework,”
then we can rewrite this as ∃x¬H(x).


2. Think of this statement as saying “If it’s in that store, then it’s either
overpriced or poorly made (no matter what it is).” Thus, we start by
writing ∀x(if x is in that store then x is either overpriced or poorly
made). To write the part in parentheses symbolically, we let S(x) stand
for “x is in that store,” O(x) for “x is overpriced,” and P(x) for “x is
poorly made.” Then our final answer is ∀x[S(x) → (O(x) ∨ P(x))].


Note that, like statement 4 in Example 2.1.1, this statement has the 
form of a universal quantifier applied to a conditional statement. This 
form occurs quite often, and it is important to learn to recognize what 
it means and when it should be used. We can check our answer to this 
problem as we did before, by using the truth table for the conditional 
connective. The only way that the statement S(x) → (O(x) ∨ P(x)) can 
be false is if x is in that store, but is neither overpriced nor poorly 
made. Thus, to say that the statement is true for all values of x means 
that this never happens, which is exactly what it means to say that 
everything in that store is either overpriced or poorly made. 


3. This means ¬(somebody is perfect), or in other words ¬∃xP(x), where 
P(x) stands for “x is perfect.” 


4. As in statement 2 in this example, we could think of this as meaning 
“If a person dislikes Joe then Susan likes that person (no matter who 
the person is).” Thus, we can start by rewriting the given statement as 
∀x(if x dislikes Joe then Susan likes x). Let L(x, y) stand for “x likes 
y.” In statements that talk about specific elements of the universe of 
discourse it is sometimes convenient to introduce letters to stand for 
those specific elements. In this case we need to talk about Joe and 
Susan, so let’s let j stand for Joe and s for Susan. Thus, we can write 
L(s, x) to mean “Susan likes x,” and ¬L(x, j) for “x dislikes Joe.” 
Filling these in, we end up with the answer ∀x(¬L(x, j) → L(s, x)). 
Notice that, once again, we have a universal quantifier applied to a 
conditional statement. As before, you can check this answer using the 
truth table for the conditional connective. 


5. According to Definition 1.4.5, to say that A is a subset of B means 
that everything in A is in B. If you’ve caught on to the pattern of how 
universal quantifiers and conditionals are combined, you should 
recognize that this would be written symbolically as ∀x(x ∈ A → x ∈ B). 


6. As in the previous statement, we first write this as ∀x(x ∈ A ∩ B → x 
∈ B \ C). Now using the definitions of intersection and difference, we 
can expand this further to get ∀x[(x ∈ A ∧ x ∈ B) → (x ∈ B ∧ x ∉ C)]. 
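
The pattern of a universal quantifier applied to a conditional can be tried out on a small universe of discourse. The following Python sketch is only an illustration: the list of items and their attributes are invented, and the helper name implies is our own. It evaluates ∀x[S(x) → (O(x) ∨ P(x))] from statement 2 and, for comparison, a formula that uses a conjunction in place of the conditional.

```python
# Hypothetical universe of discourse: some items are in the store, some are not.
items = [
    {"name": "lamp", "in_store": True, "overpriced": True, "poorly_made": False},
    {"name": "chair", "in_store": True, "overpriced": False, "poorly_made": True},
    {"name": "boat", "in_store": False, "overpriced": False, "poorly_made": False},
]


def implies(p, q):
    """Truth table of the conditional: false only when p is true and q is false."""
    return (not p) or q


# ∀x[S(x) → (O(x) ∨ P(x))]
statement = all(
    implies(x["in_store"], x["overpriced"] or x["poorly_made"]) for x in items
)

# For comparison: ∀x[S(x) ∧ (O(x) ∨ P(x))], which also claims everything is in the store.
with_conjunction = all(
    x["in_store"] and (x["overpriced"] or x["poorly_made"]) for x in items
)

print(statement, with_conjunction)
```

In this made-up universe the conditional form comes out true, because the boat is not in the store and so cannot make the conditional false, while the conjunction form comes out false.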


Although all of our examples so far have contained only one quantifier, 
there’s no reason why a statement can’t have more than one quantifier. For 
example, consider the statement “Some students are married.” The word 
some indicates that this statement should be written using an existential 
quantifier, so we can think of it as having the form ∃x(x is a student and x is 
married). Let S(x) stand for “x is a student.” We could similarly choose a 
letter to stand for “x is married,” but perhaps a better analysis would be to 
recognize that to be married means to be married to someone. Thus, if we 
let M(x, y) stand for “x is married to y,” then we can write “x is married” as 
∃yM(x, y). We can therefore represent the entire statement by the formula 
∃x(S(x) ∧ ∃yM(x, y)), a formula containing two existential quantifiers. 

As another example, let’s analyze the statement “All parents are 
married.” We start by writing it as ∀x(if x is a parent then x is married). 
Parenthood, like marriage, is a relationship between two people; to be a 
parent means to be a parent of someone. Thus, it might be best to represent 
the statement “x is a parent” by the formula ∃yP(x, y), where P(x, y) means 
“x is a parent of y.” If we again represent “x is married” by the formula 
∃yM(x, y), then our analysis of the original statement will be ∀x(∃yP(x, y) 
→ ∃yM(x, y)). Although this isn’t wrong, the double use of the variable y 
could cause confusion. Perhaps a better solution would be to replace the 
formula ∃yM(x, y) with the equivalent formula ∃zM(x, z). (Recall that these 
are equivalent because a bound variable in any statement can be replaced by 
another without changing the meaning of the statement.) Our improved 
analysis of the statement would then be ∀x(∃yP(x, y) → ∃zM(x, z)). 


A common mistake made by beginners is to leave out quantifiers. For 
example, you might be tempted to represent the statement “All parents are 
married” incorrectly by the formula ∀x(P(x, y) → M(x, z)), leaving out ∃y 
and ∃z. A good way to catch such mistakes is to pay attention to free and 
bound variables. In the incorrect formula, there are no quantifiers binding 
the variables y and z, so y and z are free variables. But the original 
statement, “All parents are married,” is not a statement about y and z, so 
these variables should not be free in the answer. This is a tip-off that 
quantifiers on y and z are missing. Note that if we translate the incorrect 
formula ∀x(P(x, y) → M(x, z)) back into English, we get a statement about 
y and z: “Everyone who is a parent of y is married to z.” 


Example 2.1.3. Analyze the logical forms of the following statements. 


1. Everybody in the dorm has a roommate he or she doesn’t like. 
2. Nobody likes a sore loser. 
3. Anyone who has a friend who has the measles will have to be 
quarantined. 
4. If anyone in the dorm has a friend who has the measles, then everyone 
in the dorm will have to be quarantined. 
5. If A ⊆ B, then A and C \ B are disjoint. 


Solutions 


1. This means ∀x(if x lives in the dorm then x has a roommate he or she 
doesn’t like). To say that x has a roommate he or she doesn’t like, we 
could write ∃y(x and y are roommates and x doesn’t like y). If we let 
R(x, y) stand for “x and y are roommates” and L(x, y) for “x likes y,” 
then this becomes ∃y(R(x, y) ∧ ¬L(x, y)). Finally, if we let D(x) mean 
“x lives in the dorm,” then the complete analysis of the original 
statement would be ∀x[D(x) → ∃y(R(x, y) ∧ ¬L(x, y))]. 


2. This is tricky, because the phrase a sore loser doesn’t refer to a 
particular sore loser; it refers to all sore losers. The statement means 
that all sore losers are disliked, or in other words ∀x(if x is a sore 
loser then nobody likes x). To say nobody likes x we write 
¬(somebody likes x), which means ¬∃yL(y, x), where L(y, x) means “y 
likes x.” If we let S(x) mean “x is a sore loser,” then the whole 
statement would be written ∀x(S(x) → ¬∃yL(y, x)). 


3. You have probably realized by now that it is usually easiest to 
translate from English into symbols in several steps, translating only a 
little bit at a time. Here are the steps we might use to translate this 
statement: 

(i) ∀x(if x has a friend who has the measles then x will have to be 
quarantined). 
(ii) ∀x[∃y(y is a friend of x and y has the measles) → x will have to be 
quarantined]. 

Now, letting F(y, x) stand for “y is a friend of x,” M(y) for “y has the 
measles,” and Q(x) for “x will have to be quarantined,” we get: 

(iii) ∀x[∃y(F(y, x) ∧ M(y)) → Q(x)]. 


4. The word anyone is difficult to interpret, because in different 
statements it means different things. In statement 3 it meant everyone, 
but in this statement it means someone. Here are the steps of our 
analysis: 

(i) (Someone in the dorm has a friend who has the measles) → 
(everyone in the dorm will have to be quarantined). 
(ii) ∃x(x lives in the dorm and x has a friend who has the measles) → 
∀z(if z lives in the dorm then z will have to be quarantined). 

Using the same abbreviations as in the last statement and letting D(x) 
stand for “x lives in the dorm,” we end up with the following formula: 

(iii) ∃x[D(x) ∧ ∃y(F(y, x) ∧ M(y))] → ∀z(D(z) → Q(z)). 


5. Clearly the answer will have the form of a conditional statement, (A 
⊆ B) → (A and C \ B are disjoint). We have already written A ⊆ B 
symbolically in Example 2.1.2. To say that A and C \ B are disjoint 
means that they have no elements in common, or in other words 
¬∃x(x ∈ A ∧ x ∈ C \ B). Putting this all together, and filling in the 
definition of C \ B, we end up with ∀x(x ∈ A → x ∈ B) → ¬∃x(x ∈ 
A ∧ x ∈ C ∧ x ∉ B). 


When a statement contains more than one quantifier it is sometimes 
difficult to figure out what it means and whether it is true or false. It may be 
best in this case to think about the quantifiers one at a time, in order. For 
example, consider the statement ∀x∃y(x + y = 5), where the universe of 
discourse is the set of all real numbers. Thinking first about just the first 
quantifier expression ∀x, we see that the statement means that for every real 
number x, the statement ∃y(x + y = 5) is true. We can worry later about 
what ∃y(x + y = 5) means; thinking about two quantifiers at once is too 
confusing. 


If we want to figure out whether or not the statement ∃y(x + y = 5) is true 
for every value of x, it might help to try out a few values of x. For example, 
suppose x = 2. Then we must determine whether or not the statement ∃y(2 + 
y = 5) is true. Now it’s time to think about the next quantifier, ∃y. This 
statement says that there is at least one value of y for which the equation 2 + 
y = 5 holds. In other words, the equation 2 + y = 5 has at least one solution. 
Of course, this is true, because the equation has the solution y = 5 − 2 = 3. 
Thus, the statement ∃y(2 + y = 5) is true. 

Let’s try one more value of x. If x = 7, then we are interested in the 
statement ∃y(7 + y = 5), which says that the equation 7 + y = 5 has at least 
one solution. Once again, this is true, since the solution is y = 5 − 7 = −2. In 
fact, you have probably realized by now that no matter what value we plug 
in for x, the equation x + y = 5 will always have the solution y = 5 − x, so 
the statement ∃y(x + y = 5) will be true. Thus, the original statement 
∀x∃y(x + y = 5) is true. 

On the other hand, the statement ∃y∀x(x + y = 5) means something 
entirely different. This statement means that there is at least one value of y 
for which the statement ∀x(x + y = 5) is true. Can we find such a value of y? 
Suppose, for example, we try y = 4. Then we must determine whether or not 
the statement ∀x(x + 4 = 5) is true. This statement says that no matter what 
value we plug in for x, the equation x + 4 = 5 holds, and this is clearly false. 
In fact, no value of x other than x = 1 works in this equation. Thus, the 
statement ∀x(x + 4 = 5) is false. 

We have seen that when y = 4 the statement ∀x(x + y = 5) is false, but 
maybe some other value of y will work. Remember, we are trying to 
determine whether or not there is at least one value of y that works. Let’s 
try one more, say, y = 9. Then we must consider the statement ∀x(x + 9 = 
5), which says that no matter what x is, the equation x + 9 = 5 holds. Once 
again this is clearly false, since only x = −4 works in this equation. In fact, 
it should be clear by now that no matter what value we plug in for y, the 
equation x + y = 5 will be true for only one value of x, namely x = 5 − y, so 
the statement ∀x(x + y = 5) will be false. Thus there are no values of y for 
which ∀x(x + y = 5) is true, so the statement ∃y∀x(x + y = 5) is false. 


Notice that we found that the statement ∀x∃y(x + y = 5) is true, but 
∃y∀x(x + y = 5) is false. Apparently, the order of the quantifiers makes a 
difference! What is responsible for this difference? The first statement says 
that for every real number x, there is a real number y such that x + y = 5. For 
example, when we tried x = 2 we found that y = 3 worked in the equation x 
+ y = 5, and with x = 7, y = −2 worked. Note that for different values of x, 
we had to use different values of y to make the equation come out true. You 
might think of this statement as saying that for each real number x there is a 
corresponding real number y such that x + y = 5. On the other hand, when 
we were analyzing the statement ∃y∀x(x + y = 5) we found ourselves 
searching for a single value of y that made the equation x + y = 5 true for all 
values of x, and this turned out to be impossible. For each value of x there is 
a corresponding value of y that makes the equation true, but no single value 
of y works for every x. 
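
The effect of quantifier order can be seen concretely by evaluating both statements over a small finite universe. The sketch below is only an analogy: it replaces the real numbers by the assumed universe {0, 1, 2, 3, 4, 5}, chosen so that 5 − x stays inside the universe, and it models ∀ with Python's all and ∃ with any.

```python
# Assumed finite stand-in for the universe of discourse: {0, 1, 2, 3, 4, 5}.
U = range(6)

# ∀x∃y(x + y = 5): for each x, search for some y that works for that x.
forall_exists = all(any(x + y == 5 for y in U) for x in U)

# ∃y∀x(x + y = 5): search for a single y that works for every x at once.
exists_forall = any(all(x + y == 5 for x in U) for y in U)

print(forall_exists, exists_forall)
```

The nesting of all and any mirrors the order of the quantifiers: in the first expression the choice of y is made inside the loop over x, so it may depend on x, while in the second a single y must be fixed before x is considered.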

For another example, consider the statement ∀x∃yL(x, y), where the 
universe of discourse is the set of all people and L(x, y) means “x likes y.” 
This statement says that for every person x, the statement ∃yL(x, y) is true. 
Now ∃yL(x, y) could be written as “x likes someone,” so the original 
statement means that for every person x, x likes someone. In other words, 
everyone likes someone. On the other hand, ∃y∀xL(x, y) means that there is 
some person y such that ∀xL(x, y) is true. As we saw in Example 2.1.1, 
∀xL(x, y) means “Everyone likes y,” so ∃y∀xL(x, y) means that there is 
some person y such that everyone likes y. In other words, there is someone 
who is universally liked. These statements don’t mean the same thing. It 
might be the case that everyone likes someone, but no one is universally 
liked. 
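
To see that everyone can like someone while no one is universally liked, it is enough to exhibit one situation in which this happens. Here is a small sketch with three invented people and an invented likes relation; the names and the relation are assumptions made purely for illustration.

```python
# Hypothetical universe of three people and a hypothetical "likes" relation.
people = ["Ann", "Bob", "Cal"]
likes = {("Ann", "Bob"), ("Bob", "Cal"), ("Cal", "Ann")}


def L(x, y):
    """L(x, y) means "x likes y"."""
    return (x, y) in likes


# ∀x∃yL(x, y): everyone likes someone.
everyone_likes_someone = all(any(L(x, y) for y in people) for x in people)

# ∃y∀xL(x, y): there is someone who is universally liked.
someone_universally_liked = any(all(L(x, y) for x in people) for y in people)

print(everyone_likes_someone, someone_universally_liked)
```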


Example 2.1.4. What do the following statements mean? Are they true or 
false? The universe of discourse in each case is N, the set of all natural 
numbers. 


1. ∀x∃y(x < y). 
2. ∃y∀x(x < y). 
3. ∃x∀y(x < y). 
4. ∀y∃x(x < y). 
5. ∃x∃y(x < y). 
6. ∀x∀y(x < y). 


Solutions 


1. This means that for every natural number x, the statement ∃y(x < y) is 
true. In other words, for every natural number x, there is a natural 
number bigger than x. This is true. For example, x + 1 is always 
bigger than x. 


2. This means that there is some natural number y such that the 
statement ∀x(x < y) is true. In other words, there is some natural 
number y such that all natural numbers are smaller than y. This is 
false. No matter what natural number y we pick, there will always be 
larger natural numbers. 


3. This means that there is a natural number x such that the statement 
∀y(x < y) is true. You might be tempted to say that this statement will 
be true if x = 0, but this isn’t right. Since 0 is the smallest natural 
number, the statement 0 < y is true for all values of y except y = 0, but 
if y = 0, then the statement 0 < y is false, and therefore ∀y(0 < y) is 
false. Similar reasoning shows that for every value of x the statement 
∀y(x < y) is false, so ∃x∀y(x < y) is false. 


4. This means that for every natural number y, there is a natural number 
smaller than y. This is true for every natural number y except y = 0, 
but there is no natural number smaller than 0. Therefore this statement 
is false. 


5. This means that there is a natural number x such that ∃y(x < y) is true. 
But as we saw in the first statement, this is actually true for every 
natural number x, so it is certainly true for at least one. Thus, ∃x∃y(x 
< y) is true. 


6. This means that for every natural number x, the statement ∀y(x < y) is 
true. But as we saw in the third statement, there isn’t even one value 
of x for which this statement is true. Thus, ∀x∀y(x < y) is false. 


Exercises 


*1. Analyze the logical forms of the following statements. 

(a) Anyone who has forgiven at least one person is a saint. 
(b) Nobody in the calculus class is smarter than everybody in the discrete 
math class. 
(c) Everyone likes Mary, except Mary herself. 
(d) Jane saw a police officer, and Roger saw one too. 
(e) Jane saw a police officer, and Roger saw him too. 

2. Analyze the logical forms of the following statements. 

(a) Anyone who has bought a Rolls Royce with cash must have a rich 
uncle. 
(b) If anyone in the dorm has the measles, then everyone who has a 
friend in the dorm will have to be quarantined. 
(c) If nobody failed the test, then everybody who got an A will tutor 
someone who got a D. 
(d) If anyone can do it, Jones can. 
(e) If Jones can do it, anyone can. 

3. Analyze the logical forms of the following statements. The universe 
of discourse is R. What are the free variables in each statement? 

(a) Every number that is larger than x is larger than y. 
(b) For every number a, the equation ax² + 4x − 2 = 0 has at least one 
solution iff a ≥ −2. 
(c) All solutions of the inequality x³ − 3x < 3 are smaller than 10. 
(d) If there is a number x such that x² + 5x = w and there is a number y 
such that 4 − y² = w, then w is strictly between −10 and 10. 

4. Translate the following statements into idiomatic English. 

(a) ∀x[(H(x) ∧ ¬∃yM(x, y)) → U(x)], where H(x) means “x is a man,” 
M(x, y) means “x is married to y,” and U(x) means “x is unhappy.” 
(b) ∃z(P(z, x) ∧ S(z, y) ∧ W(y)), where P(z, x) means “z is a parent of x,” 
S(z, y) means “z and y are siblings,” and W(y) means “y is a woman.” 

5. Translate the following statements into idiomatic mathematical 
English. 

(a) ∀x[(P(x) ∧ ¬(x = 2)) → O(x)], where P(x) means “x is a prime 
number” and O(x) means “x is odd.” 
(b) ∃x[P(x) ∧ ∀y(P(y) → y ≤ x)], where P(x) means “x is a perfect 
number.” 

6. Translate the following statements into idiomatic mathematical 
English. Are they true or false? The universe of discourse is R. 

(a) ¬∃x(x² + 2x + 3 = 0 ∧ x² + 2x − 3 = 0). 
(b) ¬[∃x(x² + 2x + 3 = 0) ∧ ∃x(x² + 2x − 3 = 0)]. 
(c) ¬∃x(x² + 2x + 3 = 0) ∧ ¬∃x(x² + 2x − 3 = 0). 

7. Are these statements true or false? The universe of discourse is the set 
of all people, and P(x, y) means “x is a parent of y.” 

(a) ∃x∀yP(x, y). 
(b) ∀x∃yP(x, y). 
(c) ¬∃x∃yP(x, y). 
(d) ∃x¬∃yP(x, y). 
(e) ∃x∃y¬P(x, y). 

8. Are these statements true or false? The universe of discourse is N. 

(a) ∀x∃y(2x − y = 0). 
(b) ∃y∀x(2x − y = 0). 
(c) ∀x∃y(x − 2y = 0). 
(d) ∀x(x < 10 → ∀y(y < x → y < 9)). 
(e) ∃y∃z(y + z = 100). 
(f) ∀x∃y(y > x ∧ ∃z(y + z = 100)). 

9. Same as exercise 8 but with R as the universe of discourse. 

10. Same as exercise 8 but with Z as the universe of discourse. 


2.2 Equivalences Involving Quantifiers 


In our study of logical connectives in Chapter 1 we found it useful to 
examine equivalences between different formulas. In this section, we will 
see that there are also a number of important equivalences involving 
quantifiers. 

For example, in Example 2.1.2 we represented the statement “Nobody’s 
perfect” by the formula ¬∃xP(x), where P(x) meant “x is perfect.” But 
another way to express the same idea would be to say that everyone fails to 
be perfect, or in other words ∀x¬P(x). This suggests that these two formulas 
are equivalent, and a little thought should show that they are. No matter 
what P(x) stands for, the formula ¬∃xP(x) means that there’s no value of x 
in the universe of discourse for which P(x) is true. But that’s the same as 
saying that for every value of x in the universe, P(x) is false, or in other 
words ∀x¬P(x). Thus, ¬∃xP(x) is equivalent to ∀x¬P(x). 

Similar reasoning shows that ¬∀xP(x) is equivalent to ∃x¬P(x). To say 
that ¬∀xP(x) means that it is not the case that for all values of x, P(x) is 
true. That’s equivalent to saying there’s at least one value of x for which 
P(x) is false, which is what it means to say ∃x¬P(x). For example, in 
Example 2.1.2 we translated “Someone didn’t do the homework” as 
∃x¬H(x), where H(x) stands for “x did the homework.” An equivalent 
statement would be “Not everyone did the homework,” which would be 
represented by the formula ¬∀xH(x). 

Thus, we have the following two laws involving negation and quantifiers: 


Quantifier Negation laws 


¬∃x P(x) is equivalent to ∀x ¬P(x). 
¬∀x P(x) is equivalent to ∃x ¬P(x). 
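
Over a finite universe these two laws can be checked exhaustively, since there are only finitely many possible truth sets for P(x). The following sketch assumes a three-element universe {0, 1, 2}, so that there are 2³ = 8 predicates to try; the function name laws_hold is our own. A check of this kind is not a proof of the laws in general, only a sanity test.

```python
from itertools import product

# Assumed small universe of discourse.
U = [0, 1, 2]


def laws_hold(truth_values):
    """Check both quantifier negation laws for one predicate P on U."""
    P = dict(zip(U, truth_values))  # P[x] is the truth value of P(x)
    # ¬∃xP(x) versus ∀x¬P(x)
    law1 = (not any(P[x] for x in U)) == all(not P[x] for x in U)
    # ¬∀xP(x) versus ∃x¬P(x)
    law2 = (not all(P[x] for x in U)) == any(not P[x] for x in U)
    return law1 and law2


# Try every possible assignment of truth values to P(0), P(1), P(2).
results = [laws_hold(tv) for tv in product([False, True], repeat=len(U))]
print(all(results), len(results))
```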


Combining these laws with De Morgan’s laws and other equivalences 
involving the logical connectives, we can often reexpress a negative 
statement as an equivalent, but easier to understand, positive statement. 
This will turn out to be an important skill when we begin to work with 
negative statements in proofs. 


Example 2.2.1. Negate these statements and then reexpress the results as 
equivalent positive statements. 


1. A ⊆ B. 
2. Everyone has a relative he or she doesn’t like. 


Solutions 


1. We already know that A ⊆ B means ∀x(x ∈ A → x ∈ B). To 
reexpress the negation of this statement as an equivalent positive 
statement, we reason as follows: 

¬∀x(x ∈ A → x ∈ B) 
is equivalent to ∃x¬(x ∈ A → x ∈ B) (quantifier negation law), 
which is equivalent to ∃x¬(x ∉ A ∨ x ∈ B) (conditional law), 
which is equivalent to ∃x(x ∈ A ∧ x ∉ B) (De Morgan’s law). 

Thus, A ⊈ B means the same thing as ∃x(x ∈ A ∧ x ∉ B). If you 
think about this, it should make sense. To say that A is not a subset of 
B is the same as saying that there’s something in A that is not in B. 


2. First of all, let’s write the original statement symbolically. You should 
be able to check that if we let R(x, y) stand for “x is related to y” and 
L(x, y) for “x likes y,” then the original statement would be written 
∀x∃y(R(x, y) ∧ ¬L(x, y)). Now we negate this and try to find a 
simpler, equivalent positive statement: 

¬∀x∃y(R(x, y) ∧ ¬L(x, y)) 
is equivalent to ∃x¬∃y(R(x, y) ∧ ¬L(x, y)) (quantifier negation law), 
which is equivalent to ∃x∀y¬(R(x, y) ∧ ¬L(x, y)) (quantifier negation law), 
which is equivalent to ∃x∀y(¬R(x, y) ∨ L(x, y)) (De Morgan’s law), 
which is equivalent to ∃x∀y(R(x, y) → L(x, y)) (conditional law). 

Let’s translate this last formula back into colloquial English. Leaving 
aside the first quantifier for the moment, the formula ∀y(R(x, y) → 
L(x, y)) means that for every person y, if x is related to y then x likes 
y. In other words, x likes all his or her relatives. Adding ∃x to the 
beginning of this, we get the statement “There is someone who likes 
all his or her relatives.” You should take a minute to convince 
yourself that this really is equivalent to the negation of the original 
statement “Everyone has a relative he or she doesn’t like.” 
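
One way to convince yourself is to compare the first and last formulas of the derivation in every possible situation over a very small universe. The sketch below assumes a universe with just two elements and runs through all 16 × 16 choices of the relations R and L; the helper names all_relations and agree are our own. Agreement in all of these cases is evidence for the equivalence, not a proof of it.

```python
from itertools import product

# Assumed two-element universe; a relation is a set of ordered pairs.
U = [0, 1]
pairs = [(x, y) for x in U for y in U]


def all_relations():
    """Generate every relation on U, as a set of ordered pairs."""
    for bits in product([False, True], repeat=len(pairs)):
        yield {p for p, b in zip(pairs, bits) if b}


def agree(R, L):
    """Compare ¬∀x∃y(R ∧ ¬L) with ∃x∀y(R → L) for particular relations."""
    negated = not all(
        any((x, y) in R and (x, y) not in L for y in U) for x in U
    )
    positive = any(
        all((x, y) not in R or (x, y) in L for y in U) for x in U
    )
    return negated == positive


checked = [agree(R, L) for R in all_relations() for L in all_relations()]
print(all(checked), len(checked))
```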


For another example of how the quantifier negation laws can help us 
understand statements, consider the statement “Everyone who Patricia likes, 
Sue doesn’t like.” If we let L(x, y) stand for “x likes y,” and we let p stand 
for Patricia and s for Sue, then this statement would be represented by the 
formula ∀x(L(p, x) → ¬L(s, x)). Now we can work out a formula equivalent 
to this one as follows: 

∀x(L(p, x) → ¬L(s, x)) 
is equivalent to ∀x(¬L(p, x) ∨ ¬L(s, x)) (conditional law), 
which is equivalent to ∀x¬(L(p, x) ∧ L(s, x)) (De Morgan’s law), 
which is equivalent to ¬∃x(L(p, x) ∧ L(s, x)) (quantifier negation law). 


Translating the last formula back into English, we get the statement 
“There’s no one who both Patricia and Sue like,” and this does mean the 
same thing as the statement we started with. 

We saw in Section 2.1 that reversing the order of two quantifiers can 
sometimes change the meaning of a formula. However, if the quantifiers are 
the same type (both ∀ or both ∃), it turns out the order can always be 
switched without affecting the meaning of the formula. For example, 
consider the statement “Someone has a teacher who is younger than he or 
she is.” To write this symbolically we first write ∃x(x has a teacher who is 
younger than x). Now to say “x has a teacher who is younger than x” we 
write ∃y(T(y, x) ∧ P(y, x)), where T(y, x) means “y is a teacher of x” and 
P(y, x) means “y is younger than x.” Putting this all together, the original 
statement would be represented by the formula ∃x∃y(T(y, x) ∧ P(y, x)). 


Now what happens if we switch the quantifiers? In other words, what 
does the formula ∃y∃x(T(y, x) ∧ P(y, x)) mean? You should be able to 
convince yourself that this formula says that there is a person y such that y 
is a teacher of someone who is older than y. In other words, someone has a 
student who is older than he or she is. But this would be true in exactly the 
same circumstances as the original statement, “Someone has a teacher who 
is younger than he or she is”! Both mean that there are people x and y such 
that y is a teacher of x and y is younger than x. In fact, this suggests that a 
good way of reading the pair of quantifiers ∃y∃x or ∃x∃y would be “there 
are objects x and y such that ....” 

Similarly, two universal quantifiers in a row can always be switched 
without changing the meaning of a formula, because ∀x∀y and ∀y∀x can 
both be thought of as meaning “for all objects x and y, ....” For example, 
consider the formula ∀x∀y(L(x, y) → A(x, y)), where L(x, y) means “x likes 
y” and A(x, y) means “x admires y.” You could think of this formula as 
saying “For all people x and y, if x likes y then x admires y.” In other words, 
people always admire the people they like. The formula ∀y∀x(L(x, y) → 
A(x, y)) means exactly the same thing. 
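
The contrast between switching quantifiers of the same type and switching quantifiers of different types can also be explored by brute force. The sketch below assumes a two-element universe and lets Q range over all 16 relations on it, with Q(x, y) read as “the pair (x, y) is in Q”; the function names are our own shorthand for the quantifier prefixes.

```python
from itertools import product

# Assumed two-element universe and every relation Q on it.
U = [0, 1]
pairs = [(x, y) for x in U for y in U]
relations = [
    {p for p, b in zip(pairs, bits) if b}
    for bits in product([False, True], repeat=len(pairs))
]


def ee_xy(Q):  # ∃x∃yQ(x, y)
    return any(any((x, y) in Q for y in U) for x in U)


def ee_yx(Q):  # ∃y∃xQ(x, y)
    return any(any((x, y) in Q for x in U) for y in U)


def aa_xy(Q):  # ∀x∀yQ(x, y)
    return all(all((x, y) in Q for y in U) for x in U)


def aa_yx(Q):  # ∀y∀xQ(x, y)
    return all(all((x, y) in Q for x in U) for y in U)


def ae(Q):  # ∀x∃yQ(x, y)
    return all(any((x, y) in Q for y in U) for x in U)


def ea(Q):  # ∃y∀xQ(x, y)
    return any(all((x, y) in Q for x in U) for y in U)


# Same-type quantifiers: switching never changes the truth value.
same_type_ok = all(
    ee_xy(Q) == ee_yx(Q) and aa_xy(Q) == aa_yx(Q) for Q in relations
)

# Mixed quantifiers: collect the relations where the two orders disagree.
mixed_differ = [Q for Q in relations if ae(Q) != ea(Q)]

print(same_type_ok, mixed_differ)
```

Even on a universe this small, some relations separate ∀x∃y from ∃y∀x, while none separates ∃x∃y from ∃y∃x or ∀x∀y from ∀y∀x.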

It is important to realize that when we say “there are objects x and y” or 
“for all objects x and y,” we are not ruling out the possibility that x and y are 
the same object. For example, the formula ∀x∀y(L(x, y) → A(x, y)) means 
not just that a person who likes another person always admires that other 
person, but also that people who like themselves also admire themselves. 
As another example, suppose we wanted to write a formula that means “x is 
a bigamist.” (Of course, x will be a free variable in this formula.) You might 
think you could express this with the formula ∃y∃z(M(x, y) ∧ M(x, z)), 
where M(x, y) means “x is married to y.” But to say that x is a bigamist you 
must say that there are two different people to whom x is married, and this 
formula doesn’t say that y and z are different. The right answer is 
∃y∃z(M(x, y) ∧ M(x, z) ∧ y ≠ z). 
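
The difference between the two bigamist formulas shows up as soon as we evaluate them for someone who is married to exactly one person. In the following sketch the people and the marriage relation are invented for illustration, and the free variable x is given the value "x".

```python
# Hypothetical universe: x is married to exactly one person, pat.
people = ["x", "pat", "sam"]
married = {("x", "pat")}


def M(a, b):
    """M(a, b) means "a is married to b"."""
    return (a, b) in married


# ∃y∃z(M(x, y) ∧ M(x, z)): nothing prevents y and z from being the same person.
without_distinct = any(
    any(M("x", y) and M("x", z) for z in people) for y in people
)

# ∃y∃z(M(x, y) ∧ M(x, z) ∧ y ≠ z): y and z must be different people.
with_distinct = any(
    any(M("x", y) and M("x", z) and y != z for z in people) for y in people
)

print(without_distinct, with_distinct)
```

The first formula comes out true here by taking y and z both to be pat, even though x is not a bigamist; the second formula correctly comes out false.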


Example 2.2.2. Analyze the logical forms of the following statements. 


1. All married couples have fights. 
2. Everyone likes at least two people. 
3. John likes exactly one person. 


Solutions 


1. ∀x∀y(M(x, y) → F(x, y)), where M(x, y) means “x and y are married 
to each other” and F(x, y) means “x and y fight with each other.” 


2. ∀x∃y∃z(L(x, y) ∧ L(x, z) ∧ y ≠ z), where L(x, y) stands for “x likes y.” 
Note that the statement means that everyone likes at least two 
different people, so it would be incorrect to leave out the “y ≠ z” at the 
end. 


3. Let L(x, y) mean “x likes y,” and let j stand for John. We translate this 
statement into symbols gradually: 


(i) ∃x(John likes x and John doesn’t like anyone other than x). 
(ii) ∃x(L(j, x) ∧ ¬∃y(John likes y and y ≠ x)). 
(iii) ∃x(L(j, x) ∧ ¬∃y(L(j, y) ∧ y ≠ x)). 


Note that for the third statement in this example we could not have given 
the simpler answer ∃xL(j, x), because this would mean that John likes at 
least one person, not exactly one person. The phrase exactly one occurs so 
often in mathematics that there is a special notation for it. We will write 
∃!xP(x) to represent the statement “There is exactly one value of x such that 
P(x) is true.” It is sometimes also read “There is a unique x such that P(x).” 
For example, the third statement in Example 2.2.2 could be written 
symbolically as ∃!xL(j, x). In fact, we could think of this as just an 
abbreviation for the formula given in Example 2.2.2 as the answer for 
statement 3. Similarly, in general we can think of ∃!xP(x) as an 
abbreviation for the formula ∃x(P(x) ∧ ¬∃y(P(y) ∧ y ≠ x)). 
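
Over a finite universe, the expanded formula for ∃!xP(x) can be compared directly with the idea of counting the values of x that make P(x) true. The sketch below assumes the universe {0, 1, ..., 9} and three sample predicates of our own choosing; the function names exists_unique and exactly_one_by_counting are hypothetical.

```python
def exists_unique(P, U):
    """∃!xP(x), expanded as ∃x(P(x) ∧ ¬∃y(P(y) ∧ y ≠ x))."""
    return any(P(x) and not any(P(y) and y != x for y in U) for x in U)


def exactly_one_by_counting(P, U):
    """Count the elements of U that satisfy P and compare the count with 1."""
    return sum(1 for x in U if P(x)) == 1


U = range(10)  # assumed finite universe {0, 1, ..., 9}

examples = {
    "x * x == 4": lambda x: x * x == 4,  # satisfied only by x = 2
    "x < 3": lambda x: x < 3,            # satisfied by 0, 1, and 2
    "x > 20": lambda x: x > 20,          # satisfied by nothing in U
}

for name, P in examples.items():
    print(name, exists_unique(P, U), exactly_one_by_counting(P, U))
```

The second and third predicates show the two ways in which ∃!xP(x) can fail: there may be more than one value of x that works, or none at all.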

Recall that when we were discussing set theory, we sometimes found it 
useful to write the truth set of P(x) as {x ∈ U | P(x)} rather than {x | P(x)}, 
to make sure it was clear what the universe of discourse was. Similarly, 
instead of writing ∀xP(x) to indicate that P(x) is true for every value of x in 
some universe U, we might write ∀x ∈ U P(x). This is read “For all x in U, 
P(x).” Similarly, we can write ∃x ∈ U P(x) to say that there is at least one 
value of x in the universe U such that P(x) is true. For example, the 
statement ∀x(x ≥ 0) would be false if the universe of discourse were the real 
numbers, but true if it were the natural numbers. We could avoid confusion 
when discussing this statement by writing either ∀x ∈ R(x ≥ 0) or ∀x ∈ 
N(x ≥ 0), to make it clear which we meant. 


As before, we sometimes use this notation not to specify the universe of 
discourse but to restrict attention to a subset of the universe. For example, if 
our universe of discourse is the real numbers and we want to say that some 
real number x has a square root, we could write ∃y(y² = x). To say that 
every positive real number has a square root, we would say ∀x ∈ R⁺∃y(y² 
= x). We could say that every positive real number has a negative square 
root by writing ∀x ∈ R⁺∃y ∈ R⁻(y² = x). In general, for any set A, the 
formula ∀x ∈ A P(x) means that for every value of x in the set A, P(x) is 
true, and ∃x ∈ A P(x) means that there is at least one value of x in the set A 
such that P(x) is true. The quantifiers in these formulas are sometimes 
called bounded quantifiers, because they place bounds on which values of x 
are to be considered. Occasionally we may use variations on this notation to 
place other kinds of restrictions on quantified variables. For example, the 
statement that every positive real number has a negative square root could 
also be written ∀x > 0∃y < 0(y² = x). 

Formulas containing bounded quantifiers can also be thought of as 
abbreviations for more complicated formulas containing only normal, 
unbounded quantifiers. To say that ∃x ∈ A P(x) means that there is some 
value of x that is in A and that also makes P(x) come out true, and another 
way to write this would be ∃x(x ∈ A ∧ P(x)). Similarly, you should 
convince yourself that ∀x ∈ A P(x) means the same thing as ∀x(x ∈ A → 
P(x)). For example, the formula ∀x ∈ R⁺∃y ∈ R⁻(y² = x) discussed earlier 
means the same thing as ∀x(x ∈ R⁺ → ∃y ∈ R⁻(y² = x)), which in turn can 
be expanded as ∀x(x ∈ R⁺ → ∃y(y ∈ R⁻ ∧ y² = x)). By the definitions of 
R⁺ and R⁻, an equivalent way to say this would be ∀x(x > 0 → ∃y(y < 0 ∧ 
y² = x)). You should make sure you are convinced that this formula, like the 
original formula, means that every positive real number has a negative 
square root. For another example, note that the statement A ⊆ B, which by 
definition means ∀x(x ∈ A → x ∈ B), could also be written as ∀x ∈ A(x ∈ 
B). 
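
The two expansions of bounded quantifiers can be compared with the bounded forms on a small example. In the sketch below, the universe (the integers from −5 to 5), the set A (its positive elements), and the predicate P (“x is even”) are all assumptions chosen for illustration.

```python
# Assumed universe of discourse and a subset A of it.
U = range(-5, 6)
A = {x for x in U if x > 0}


def P(x):
    """Sample predicate: x is even."""
    return x % 2 == 0


# ∀x ∈ A P(x), and its expansion ∀x(x ∈ A → P(x)).
bounded_forall = all(P(x) for x in A)
expanded_forall = all((x not in A) or P(x) for x in U)

# ∃x ∈ A P(x), and its expansion ∃x(x ∈ A ∧ P(x)).
bounded_exists = any(P(x) for x in A)
expanded_exists = any((x in A) and P(x) for x in U)

print(bounded_forall, expanded_forall, bounded_exists, expanded_exists)
```

Notice that the universal quantifier expands with a conditional while the existential quantifier expands with a conjunction, as in the formulas above.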

It is interesting to note that the quantifier negation laws work for 
bounded quantifiers as well. In fact, we can derive these bounded quantifier 
negation laws from the original laws by thinking of the bounded quantifiers 
as abbreviations, as described earlier. For example, 


¬∀x ∈ A P(x) 
is equivalent to ¬∀x(x ∈ A → P(x)) (expanding abbreviation), 
which is equivalent to ∃x¬(x ∈ A → P(x)) (quantifier negation law), 
which is equivalent to ∃x¬(x ∉ A ∨ P(x)) (conditional law), 
which is equivalent to ∃x(x ∈ A ∧ ¬P(x)) (De Morgan’s law), 
which is equivalent to ∃x ∈ A ¬P(x) (abbreviation). 


Thus, we have shown that ¬∀x ∈ A P(x) is equivalent to ∃x ∈ A ¬P(x). 
You are asked in exercise 5 to prove the other bounded quantifier negation 
law, that ¬∃x ∈ A P(x) is equivalent to ∀x ∈ A ¬P(x). 

It should be clear that if A = ∅ then ∃x ∈ A P(x) will be false no matter 
what the statement P(x) is. There can be nothing in A that, when plugged in 
for x, makes P(x) come out true, because there is nothing in A at all! It may 
not be so clear whether ∀x ∈ A P(x) should be considered true or false, but 
we can find the answer using the quantifier negation laws: 


∀x ∈ A P(x) 
is equivalent to ¬¬∀x ∈ A P(x) (double negation law), 
which is equivalent to ¬∃x ∈ A ¬P(x) (quantifier negation law). 


Now if A = ∅ then this last formula will be true, no matter what the 
statement P(x) is, because, as we have seen, ∃x ∈ A ¬P(x) must be false. 
Thus, ∀x ∈ A P(x) is always true if A = ∅. Mathematicians sometimes say 
that such a statement is vacuously true. Another way to see this is to rewrite 
the statement ∀x ∈ A P(x) in the equivalent form ∀x(x ∈ A → P(x)). Now 
according to the truth table for the conditional connective, the only way this 
can be false is if there is some value of x such that x ∈ A is true but P(x) is 
false. But there is no such value of x, simply because there isn’t a value of x 
for which x ∈ A is true. 

As an application of this principle, we note that the empty set is a subset 
of every set. To see why, just rewrite the statement A ⊆ B in the equivalent 
form ∀x ∈ A(x ∈ B). Now if A = ∅ then, as we have just observed, this 
statement will be vacuously true. Thus, no matter what the set B is, ∅ ⊆ B. 
Another example of a vacuously true statement is the statement “All 
unicorns are purple.” We could represent this by the formula ∀x ∈ A P(x), 
where A is the set of all unicorns and P(x) stands for “x is purple.” Since 


there are no unicorns, A is the empty set, so the statement is vacuously true. 
(Notice that the statement “All unicorns are green” is also true — which does 
not contradict the fact that all unicorns are purple!) 

Perhaps you have noticed by now that, although in Chapter 1 we were 
always able to check equivalences involving logical connectives by making 
truth tables, we have no such simple way of checking equivalences 
involving quantifiers. So far, we have justified our equivalences involving 
quantifiers by just looking at examples and using common sense. As the 
formulas we work with get more complicated, this method will become 
unreliable and difficult to use. Fortunately, in Chapter 3 we will develop 
better methods for reasoning about statements involving quantifiers. To get 
more practice in thinking about quantifiers, we will work out a few 
somewhat more complicated equivalences using common sense. If you’re 
not completely convinced that these equivalences are right, you’ll be able to 
check them more carefully when you get to Chapter 3. 

Consider the statement “Everyone is bright-eyed and bushy-tailed.” If we 
let E(x) mean “x is bright-eyed” and T(x) mean “x is bushy-tailed,” then we 
could represent this statement by the formula ∀x(E(x) ∧ T(x)). Is this
equivalent to the formula ∀xE(x) ∧ ∀xT(x)? This latter formula means
“Everyone is bright-eyed, and also everyone is bushy-tailed,” and
intuitively this means the same thing as the original statement. Thus, it
appears that ∀x(E(x) ∧ T(x)) is equivalent to ∀xE(x) ∧ ∀xT(x). In other
words, we could say that the universal quantifier distributes over 
conjunction. 

However, the corresponding distributive law doesn’t work for the
existential quantifier. Consider the formulas ∃x(E(x) ∧ T(x)) and ∃xE(x)
∧ ∃xT(x). The first means that there is someone who is both bright-eyed and
bushy-tailed, and the second means that there is someone who is
bright-eyed, and there is also someone who is bushy-tailed. These don’t mean
the same thing at all. In the second statement the bright-eyed person and the
bushy-tailed person don’t have to be the same, but in the first statement they
do. Another way to see the difference between the two statements is to think
about truth sets. Let A be the truth set of E(x) and B the truth set of T(x). In
other words, A is the set of bright-eyed people, and B is the set of
bushy-tailed people. Then the second statement says that neither A nor B is
the empty set, but the first says that A ∩ B is not the empty set, or in other
words that A and B are not disjoint.
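
The contrast between the two quantifiers can be explored by brute force on a
tiny universe. The sketch below is an assumption-laden illustration, not a
proof: it fixes a hypothetical two-person universe and tries every way of
making E and T true or false for each person. Agreement in all of these cases
is evidence for the universal law, while a single disagreement is enough to
refute the existential one.

```python
from itertools import product

universe = ["Ann", "Bob"]  # hypothetical two-person universe of discourse


def check_all_interpretations():
    """Compare both candidate distributive laws under every interpretation."""
    forall_law_holds = True
    exists_counterexamples = []
    truth_rows = product([False, True], repeat=len(universe))
    for e_vals, t_vals in product(truth_rows, repeat=2):
        E = dict(zip(universe, e_vals))  # E[x] plays the role of E(x)
        T = dict(zip(universe, t_vals))  # T[x] plays the role of T(x)

        # Universal quantifier and conjunction.
        lhs_all = all(E[x] and T[x] for x in universe)
        rhs_all = all(E[x] for x in universe) and all(T[x] for x in universe)
        forall_law_holds = forall_law_holds and (lhs_all == rhs_all)

        # Existential quantifier and conjunction.
        lhs_ex = any(E[x] and T[x] for x in universe)
        rhs_ex = any(E[x] for x in universe) and any(T[x] for x in universe)
        if lhs_ex != rhs_ex:
            exists_counterexamples.append((E, T))
    return forall_law_holds, exists_counterexamples


holds, counterexamples = check_all_interpretations()
print(holds)  # True
for E, T in counterexamples:
    print("E:", E, "T:", T)
```

Each counterexample printed is an interpretation in which the truth sets of E
and T are both nonempty but disjoint, which is exactly the situation described
above.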

As an application of the distributive law for the universal quantifier and
conjunction, suppose A and B are sets and consider the equation A = B. We
know that two sets are equal when they have exactly the same elements.
Thus, the equation A = B means ∀x(x ∈ A ↔ x ∈ B), which is equivalent to
∀x[(x ∈ A → x ∈ B) ∧ (x ∈ B → x ∈ A)]. Because the universal quantifier
distributes over conjunction, this is equivalent to the formula ∀x(x ∈ A → x
∈ B) ∧ ∀x(x ∈ B → x ∈ A), and by the definition of subset this means A ⊆
B ∧ B ⊆ A. Thus, we have shown that the equation A = B is also equivalent
to the formula A ⊆ B ∧ B ⊆ A.


We have now introduced seven basic logical symbols: the connectives ∧,
∨, ¬, →, and ↔, and the quantifiers ∀ and ∃. It is a remarkable fact that the
structure of all mathematical statements can be understood using these 
symbols, and all mathematical reasoning can be analyzed in terms of the 
proper use of these symbols. To illustrate the power of the symbols we have 
introduced, we conclude this section by writing out a few more 
mathematical statements in logical notation. 


Example 2.2.3. Analyze the logical forms of the following statements. 


1. Statements about the natural numbers. The universe of discourse is N. 


(a) x is a perfect square.

(b) x is a multiple of y.

(c) x is prime.

(d) x is the smallest positive number that is a multiple of both y and z.


2. Statements about the real numbers. The universe of discourse is R. 
(a) The identity element for addition is 0. 
(b) Every real number has an additive inverse. 


(c) Negative numbers don’t have square roots. 
(d) Every positive number has exactly two square roots. 


Solutions 


1. (a) This means that x is the square of some natural number, or in
other words ∃y(x = y²).

(b) This means that x is equal to y times some natural number, or in other
words ∃z(x = yz).

(c) This means that x > 1, and x cannot be written as a product of two
smaller natural numbers. In symbols: x > 1 ∧ ¬∃y∃z(x = yz ∧ y < x
∧ z < x).

(d) We translate this in several steps:

(i) x is positive and x is a multiple of both y and z and there is no
smaller positive number that is a multiple of both y and z.

(ii) x > 0 ∧ ∃a(x = ya) ∧ ∃b(x = zb) ∧ ¬∃w(w > 0 ∧ w < x ∧ (w is a
multiple of both y and z)).

(iii) x > 0 ∧ ∃a(x = ya) ∧ ∃b(x = zb) ∧ ¬∃w(w > 0 ∧ w < x ∧ ∃c(w =
yc) ∧ ∃d(w = zd)).


2. (a) ∀x(x + 0 = x).

(b) ∀x∃y(x + y = 0).

(c) ∀x(x < 0 → ¬∃y(y² = x)).

(d) We translate this gradually:

(i) ∀x(x > 0 → x has exactly two square roots).

(ii) ∀x(x > 0 → ∃y∃z(y and z are square roots of x and y ≠ z and
nothing else is a square root of x)).

(iii) ∀x(x > 0 → ∃y∃z(y² = x ∧ z² = x ∧ y ≠ z ∧ ¬∃w(w² = x ∧ w ≠ y ∧
w ≠ z))).
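
The formulas in part 1 quantify over natural numbers only, so they can be
tested mechanically. The sketch below is one possible Python transcription,
under the assumption that the natural numbers include 0; the function names
are introduced here for illustration. Each unbounded quantifier is replaced by
a bounded search, which is safe in these cases because any witness can be
taken to be at most x.

```python
def is_perfect_square(x):
    # ∃y(x = y²): any witness y satisfies y <= x.
    return any(x == y * y for y in range(x + 1))


def is_multiple(x, y):
    # ∃z(x = yz): a witness z, if there is one, can be chosen with z <= x.
    return any(x == y * z for z in range(x + 1))


def is_prime(x):
    # x > 1 ∧ ¬∃y∃z(x = yz ∧ y < x ∧ z < x)
    return x > 1 and not any(x == y * z for y in range(x) for z in range(x))


def is_smallest_common_multiple(x, y, z):
    # x > 0 ∧ ∃a(x = ya) ∧ ∃b(x = zb)
    #       ∧ ¬∃w(w > 0 ∧ w < x ∧ ∃c(w = yc) ∧ ∃d(w = zd))
    return (
        x > 0
        and is_multiple(x, y)
        and is_multiple(x, z)
        and not any(is_multiple(w, y) and is_multiple(w, z) for w in range(1, x))
    )


print([x for x in range(20) if is_prime(x)])  # [2, 3, 5, 7, 11, 13, 17, 19]
print(is_perfect_square(49), is_perfect_square(50))  # True False
print(is_smallest_common_multiple(12, 4, 6))  # True
print(is_smallest_common_multiple(24, 4, 6))  # False
```

The statements about the real numbers in part 2 cannot be checked this way,
since their quantifiers range over an infinite set with no such bound.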


Exercises 


*1. Negate these statements and then reexpress the results as equivalent
positive statements. (See Example 2.2.1.)

(a) Everyone who is majoring in math has a friend who needs help with
his or her homework.

(b) Everyone has a roommate who dislikes everyone.

(c) A ∪ B ⊆ C \ D.

(d) ∃x∀y[y > x → ∃z(z² + 5z = y)].

2. Negate these statements and then reexpress the results as equivalent
positive statements. (See Example 2.2.1.)

(a) There is someone in the freshman class who doesn’t have a roommate.

(b) Everyone likes someone, but no one likes everyone.

(c) ∀a ∈ A∃b ∈ B(a ∈ C ↔ b ∈ C).

(d) ∀y > 0∃x(ax² + bx + c = y).

3. Are these statements true or false? The universe of discourse is N.

(a) ∀x(x < 7 → ∃a∃b∃c(a² + b² + c² = x)).

(b) ∃!x(x² + 3 = 4x).

(c) ∃!x(x² = 4x + 5).

(d) ∃x∃y(x² = 4x + 5 ∧ y² = 4y + 5).

4. Show that the second quantifier negation law, which says that
¬∀xP(x) is equivalent to ∃x¬P(x), can be derived from the first,
which says that ¬∃xP(x) is equivalent to ∀x¬P(x). (Hint: Use the
double negation law.)

5. Show that ¬∃x ∈ A P(x) is equivalent to ∀x ∈ A ¬P(x).

6. Show that the existential quantifier distributes over disjunction. In
other words, show that ∃x(P(x) ∨ Q(x)) is equivalent to ∃xP(x) ∨
∃xQ(x). (Hint: Use the fact, discussed in this section, that the
universal quantifier distributes over conjunction.)

7. Show that ∃x(P(x) → Q(x)) is equivalent to ∀xP(x) → ∃xQ(x).

*8. Show that (∀x ∈ A P(x)) ∧ (∀x ∈ B P(x)) is equivalent to ∀x ∈ (A ∪
B) P(x). (Hint: Start by writing out the meanings of the bounded
quantifiers in terms of unbounded quantifiers.)

9. Is ∀x(P(x) ∨ Q(x)) equivalent to ∀xP(x) ∨ ∀xQ(x)? Explain. (Hint:
Try assigning meanings to P(x) and Q(x).)

10. (a) Show that ∃x ∈ A P(x) ∨ ∃x ∈ B P(x) is equivalent to ∃x ∈ (A
∪ B) P(x).

(b) Is ∃x ∈ A P(x) ∧ ∃x ∈ B P(x) equivalent to ∃x ∈ (A ∩ B) P(x)?
Explain.

*11. Show that the statements A ⊆ B and A \ B = ∅ are equivalent by
writing each in logical symbols and then showing that the resulting
formulas are equivalent.

12. Show that the statements C ⊆ A ∪ B and C \ A ⊆ B are equivalent by
writing each in logical symbols and then showing that the resulting
formulas are equivalent.

13. (a) Show that the statements A ⊆ B and A ∪ B = B are equivalent by
writing each in logical symbols and then showing that the
resulting formulas are equivalent. (Hint: You may find exercise
11 from Section 1.5 useful.)

(b) Show that the statements A ⊆ B and A ∩ B = A are equivalent.

*14. Show that the statements A ∩ B = ∅ and A \ B = A are equivalent.

15. Let T(x, y) mean “x is a teacher of y.” What do the following
statements mean? Under what circumstances would each one be true?
Are any of them equivalent to each other?

(a) ∃!yT(x, y).

(b) ∃x∃!yT(x, y).

(c) ∃!x∃yT(x, y).

(d) ∃y∃!xT(x, y).

(e) ∃!x∃!yT(x, y).

(f) ∃x∃y[T(x, y) ∧ ¬∃u∃v(T(u, v) ∧ (u ≠ x ∨ v ≠ y))].


2.3 More Operations on Sets 


Now that we know how to work with quantifiers, we are ready to discuss 
some more advanced topics in set theory. 

So far, the only way we have to define sets, other than listing their
elements one by one, is to use the elementhood test notation {x | P(x)}.
Sometimes this notation is modified by allowing the x before the vertical
line to be replaced with a more complex expression. For example, suppose
we wanted to define S to be the set of all perfect squares. Perhaps the
easiest way to describe this set is to say that it consists of all numbers of the
form n², where n is a natural number. This is written S = {n² | n ∈ N}. Note
that, using our solution for the first statement from Example 2.2.3, we could
also define this set by writing S = {x | ∃n ∈ N(x = n²)}. Thus, {n² | n ∈ N}
= {x | ∃n ∈ N(x = n²)} and therefore x ∈ {n² | n ∈ N} means the same
thing as ∃n ∈ N(x = n²).
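
Readers who program may recognize both notations. The following Python sketch
is an illustration only: it truncates N to a finite range, and the names
N_BOUND, S_form, and S_test are assumptions introduced here. The first
comprehension mirrors {n² | n ∈ N}, and the second mirrors the elementhood
test {x | ∃n ∈ N(x = n²)}.

```python
N_BOUND = 20                 # stand-in for N: the naturals 0, 1, ..., 19
naturals = range(N_BOUND)

# {n² | n ∈ N}: build each element from an expression in n.
S_form = {n * n for n in naturals}

# {x | ∃n ∈ N(x = n²)}: test each candidate x for elementhood.
largest = (N_BOUND - 1) ** 2
S_test = {x for x in range(largest + 1) if any(x == n * n for n in naturals)}

print(S_form == S_test)      # True
print(sorted(S_form)[:6])    # [0, 1, 4, 9, 16, 25]
```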

Similar notation is often used if the elements of a set have been
numbered. For example, suppose we wanted to form the set whose elements
are the first 100 prime numbers. We might start by numbering the prime
numbers, calling them p_1, p_2, p_3, .... In other words, p_1 = 2, p_2 = 3,
p_3 = 5, and so on. Then the set we are looking for would be the set
P = {p_1, p_2, p_3, ..., p_100}. Another way of describing this set would be
to say that it consists of all numbers p_i, for i an element of the set
I = {1, 2, 3, ..., 100} = {i ∈ N | 1 ≤ i ≤ 100}. This could be written
P = {p_i | i ∈ I}. Each element p_i in this set is identified by a number
i ∈ I, called the index of the element. A set defined in this way is
sometimes called an indexed family, and I is called the index set.


Although the indices for an indexed family are often numbers, they need
not be. For example, suppose S is the set of all students at your school. If
we wanted to form the set of all mothers of students, we might let m_s stand
for the mother of s, for any student s. Then the set of all mothers of students
could be written M = {m_s | s ∈ S}. This is an indexed family in which the
index set is S, the set of all students. Each mother in the set is identified by
naming the student who is her child. Note that we could also define this set
using an elementhood test, by writing M = {m | m is the mother of some
student} = {m | ∃s ∈ S(m = m_s)}. In general, any indexed family A = {x_i | i
∈ I} can also be defined as A = {x | ∃i ∈ I(x = x_i)}. It follows that the
statement x ∈ {x_i | i ∈ I} means the same thing as ∃i ∈ I(x = x_i).
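
An indexed family corresponds closely to a lookup table from indices to
elements. As a hedged illustration, the Python sketch below builds the family
of the first 100 primes; the helper first_primes and the dictionary p are
names assumed here, with p[i] playing the role of p_i.

```python
def first_primes(count):
    """Return the first `count` primes by trial division (fine for small counts)."""
    primes = []
    candidate = 2
    while len(primes) < count:
        if all(candidate % q != 0 for q in primes):
            primes.append(candidate)
        candidate += 1
    return primes


I = range(1, 101)                    # the index set {1, 2, ..., 100}
p = dict(zip(I, first_primes(100)))  # p[i] stands for p_i

P = {p[i] for i in I}                # the indexed family {p_i | i ∈ I}


def in_family(x):
    # x ∈ {p_i | i ∈ I} means ∃i ∈ I(x = p_i).
    return any(x == p[i] for i in I)


print(p[1], p[2], p[3])              # 2 3 5
print(len(P))                        # 100
print(in_family(541), in_family(4))  # True False
```

Here 541 is p_100, the last element of the family.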


Example 2.3.1. Analyze the logical forms of the following statements by 
writing out the definitions of the set theory notation used. 


1. y ∈ {∛x | x ∈ Q}.
2. {x_i | i ∈ I} ⊆ A.
3. {n² | n ∈ N} and {n³ | n ∈ N} are not disjoint.


Solutions 


1. ∃x ∈ Q(y = ∛x).


2. By the definition of subset we must say that every element of {x_i | i ∈
I} is also an element of A, so we could start by writing ∀x(x ∈ {x_i | i
∈ I} → x ∈ A). Filling in the meaning of x ∈ {x_i | i ∈ I}, which we
worked out earlier, we would end up with ∀x(∃i ∈ I(x = x_i) → x ∈
A). But since the elements of {x_i | i ∈ I} are just the x_i’s, for all i ∈ I,
perhaps an easier way of saying that every element of {x_i | i ∈ I} is an
element of A would be ∀i ∈ I(x_i ∈ A). The two answers we have
given are equivalent, but showing this would require the methods we
will be studying in Chapter 3.

3. We must say that the two sets have a common element, so one
solution is to start by writing ∃x(x ∈ {n² | n ∈ N} ∧ x ∈ {n³ | n ∈
N}). However, as in the last statement, there is an easier way. An
element common to the two sets would have to be the square of some
natural number and also the cube of some (possibly different) natural
number. Thus, we could say that there is such a common element by
saying ∃n ∈ N∃m ∈ N(n² = m³). Note that it would be wrong to
write ∃n ∈ N(n² = n³), because this wouldn’t allow for the possibility
of the two natural numbers being different. By the way, this statement
is true, since 64 = 8² = 4³, so 64 is an element of both sets.
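
The difference between using two variables and one in solution 3 shows up in a
small search. This Python sketch is only an illustration over a truncated
range (the name BOUND is an assumption made here); it lists the pairs (n, m)
with n² = m³ and compares them with what a single shared variable would find.

```python
BOUND = 10  # search n and m over 0, 1, ..., 9 only

squares = {n ** 2 for n in range(BOUND)}
cubes = {m ** 3 for m in range(BOUND)}
print(sorted(squares & cubes))  # [0, 1, 64]

# ∃n ∈ N∃m ∈ N(n² = m³): the two witnesses may differ.
pairs = [(n, m) for n in range(BOUND) for m in range(BOUND) if n ** 2 == m ** 3]
print(pairs)                    # [(0, 0), (1, 1), (8, 4)]

# A single shared variable only finds common elements of the form n² = n³.
shared = sorted(n ** 2 for n in range(BOUND) if n ** 2 == n ** 3)
print(shared)                   # [0, 1]
```

The common element 64 is found only when n and m are allowed to be different.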


Anything at all can be an element of a set. Some interesting and useful
ideas arise when we consider the possibility of a set having other sets as
elements. For example, suppose A = {1, 2, 3}, B = {4}, and C = ∅. There is
no reason why we couldn’t form the set F = {A, B, C}, whose elements are
the three sets A, B, and C. Filling in the definitions of A, B, and C, we could
write this in another way: F = {{1, 2, 3}, {4}, ∅}. Note that 1 ∈ A and A
∈ F but 1 ∉ F. F has only three elements, and all three of them are sets,
not numbers. Sets such as F, whose elements are all sets, are sometimes
called families of sets.
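
The distinction between 1 ∈ A and 1 ∈ F can be seen concretely in a language
with a set type. In the Python sketch below (an illustration only), the inner
sets are written as frozenset values because Python requires the elements of a
set to be immutable.

```python
A = frozenset({1, 2, 3})
B = frozenset({4})
C = frozenset()          # the empty set

F = {A, B, C}            # a family of sets: its elements are sets

print(1 in A)            # True
print(A in F)            # True
print(1 in F)            # False: the elements of F are sets, not numbers
print(len(F))            # 3
```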


It is often convenient to define families of sets as indexed families. For
example, suppose we again let S stand for the set of all students, and for
each student s we let C_s be the set of courses that s has taken. Then the
collection of all of these sets C_s would be an indexed family of sets F = {C_s
| s ∈ S}. Remember that the elements of this family are not courses but sets
of courses. If we let t stand for some particular student Tina, and if Tina has
taken Calculus, English Composition, and American History, then C_t =
{Calculus, English Composition, American History} and C_t ∈ F, but
Calculus ∉ F.


An important example of a family of sets is given by the power set of a 
set. 


Definition 2.3.2. Suppose A is a set. The power set of A, denoted 𝒫(A), is
the set whose elements are all the subsets of A. In other words,


𝒫(A) = {x | x ⊆ A}.


For example, the set A = {7, 12} has four subsets: ∅, {7}, {12}, and {7,
12}. Thus, 𝒫(A) = {∅, {7}, {12}, {7, 12}}. What about 𝒫(∅)? Although
∅ has no elements, it does have one subset, namely ∅. Thus, 𝒫(∅) = {∅}.
Note that, as we saw in Section 1.3, {∅} is not the same as ∅.
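
For a finite set the power set can be listed mechanically. The following
Python sketch is one possible way to do so, offered as an illustration; the
function name power_set is an assumption introduced here, and subsets are
represented as frozenset values so that they can be elements of a set.

```python
from itertools import combinations


def power_set(A):
    """Return the set of all subsets of the finite set A."""
    elems = list(A)
    return {
        frozenset(c)
        for r in range(len(elems) + 1)   # subset sizes 0, 1, ..., |A|
        for c in combinations(elems, r)  # every subset of size r
    }


print(len(power_set({7, 12})))                  # 4
print(frozenset({7}) in power_set({7, 12}))     # True
print(power_set(set()) == {frozenset()})        # True: the power set of ∅ is {∅}
print(power_set(set()) == set())                # False: {∅} is not ∅
```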

Any time you are working with some subsets of a set X, it may be helpful
to remember that all of these subsets of X are elements of 𝒫(X), by the
definition of power set. For example, if we let C be the set of all courses
offered at your school, then each of the sets C_s discussed earlier is a subset
of C. Thus, for each student s, C_s ∈ 𝒫(C). This means that every element
of the family F = {C_s | s ∈ S} is an element of 𝒫(C), so F ⊆ 𝒫(C).


Example 2.3.3. Analyze the logical forms of the following statements. 
1. x ∈ 𝒫(A).
2. 𝒫(A) ⊆ 𝒫(B).
3. B ∈ {𝒫(A) | A ∈ F}.
4. x ∈ 𝒫(A ∩ B).
5. x ∈ 𝒫(A) ∩ 𝒫(B).


Solutions


1. By the definition of power set, the elements of 𝒫(A) are the subsets
of A. Thus, to say that x ∈ 𝒫(A) means that x ⊆ A, which we already
know can be written as ∀y(y ∈ x → y ∈ A).

2. By the definition of subset, this means ∀x(x ∈ 𝒫(A) → x ∈ 𝒫(B)).
Now, writing out x ∈ 𝒫(A) and x ∈ 𝒫(B) as before, we get ∀x[∀y(y
∈ x → y ∈ A) → ∀y(y ∈ x → y ∈ B)].

3. As before, this means ∃A ∈ F(B = 𝒫(A)). Now, to say that B = 𝒫(A)
means that the elements of B are precisely the subsets of A, or in other
words ∀x(x ∈ B ↔ x ⊆ A). Filling this in, and writing out the
definition of subset, we get our final answer, ∃A ∈ F ∀x(x ∈ B ↔
∀y(y ∈ x → y ∈ A)).

4. As in the first statement, we start by writing this as ∀y(y ∈ x → y ∈
A ∩ B). Now, filling in the definition of intersection, we get ∀y(y ∈ x
→ (y ∈ A ∧ y ∈ B)).

5. By the definition of intersection, this means (x ∈ 𝒫(A)) ∧ (x ∈
𝒫(B)). Now, writing out the definition of power set as before, we get
∀y(y ∈ x → y ∈ A) ∧ ∀y(y ∈ x → y ∈ B).


Note that for statement 5 in this example we first wrote out the definition 
of intersection and then used the definition of power set, whereas in 
statement 4 we started by writing out the definition of power set and then 
used the definition of intersection. As you learn the definitions of more 
mathematical terms and symbols, it will become more important to be able 
to choose which definition to think about first when working out the 
meaning of a complex mathematical statement. A good rule of thumb is to 
always start with the “outermost” symbol. In statement 4 in Example 2.3.3, 
the intersection symbol occurred inside the power set notation, so we wrote 
out the definition of power set first. In statement 5, the power set notation 
occurred within both sides of the notation for the intersection of two sets, so
we started with the definition of intersection. Similar considerations led us
to use the definition of subset first, rather than power set, in statement 2.


It is interesting to note that our answers for statements 4 and 5 in
Example 2.3.3 are equivalent. (You are asked to verify this in exercise 11.)
As in Section 1.4, it follows that for any sets A and B, 𝒫(A ∩ B) = 𝒫(A) ∩
𝒫(B). You are asked in exercise 12 to show that this equation is not true in
general if we change ∩ to ∪.
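
The equation for ∩ can be spot-checked on small sets. The Python sketch below
is an illustration, not a proof: it tests the equation for every pair of
subsets A and B of a hypothetical three-element universe, using an assumed
helper power_set.

```python
from itertools import combinations


def power_set(A):
    """Return the set of all subsets of the finite set A, as frozensets."""
    elems = list(A)
    return {
        frozenset(c)
        for r in range(len(elems) + 1)
        for c in combinations(elems, r)
    }


universe = {1, 2, 3}
subsets = power_set(universe)  # all 8 candidate values for A and for B

identity_holds = all(
    power_set(A & B) == power_set(A) & power_set(B)
    for A in subsets
    for B in subsets
)
print(identity_holds)  # True
```

A check like this cannot replace the argument requested in exercise 11, since
it covers only finitely many sets.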

Consider once again the family of sets F = {C_s | s ∈ S}, where S is the
set of all students and for each student s, C_s is the set of all courses that s
has taken. If we wanted to know which courses had been taken by all
students, we would need to find those elements that all the sets in F have in
common. The set of all these common elements is called the intersection of
the family F and is written ⋂F. Similarly, the union of the family F,
written ⋃F, is the set resulting from throwing all the elements of all the sets
in F together into one set. In this case, ⋃F would be the set of all courses
that had been taken by any student.


Example 2.3.4. Let F = {{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}}. Find ⋂F
and ⋃F.


Solution

⋂F = {1, 2, 3, 4} ∩ {2, 3, 4, 5} ∩ {3, 4, 5, 6} = {3, 4}.


⋃F = {1, 2, 3, 4} ∪ {2, 3, 4, 5} ∪ {3, 4, 5, 6} = {1, 2, 3, 4, 5, 6}.


Although these examples may make it clear what we mean by ⋂F and
⋃F, we still have not given careful definitions for these sets. In general, if
F is any family of sets, then we want ⋂F to contain the elements that all
the sets in F have in common. Thus, to be an element of ⋂F, an object will
have to be an element of every set in F. On the other hand, anything that is
an element of any of the sets in F should be in ⋃F, so to be in ⋃F an
object only needs to be an element of at least one set in F. Thus, we are led
to the following general definitions.


Definition 2.3.5. Suppose F is a family of sets. Then the intersection and
union of F are the sets ⋂F and ⋃F defined as follows:


⋂F = {x | ∀A ∈ F(x ∈ A)} = {x | ∀A(A ∈ F → x ∈ A)}.
⋃F = {x | ∃A ∈ F(x ∈ A)} = {x | ∃A(A ∈ F ∧ x ∈ A)}.


Some mathematicians consider ⋂F to be undefined if F = ∅. For an
explanation of the reason for this, see exercise 15. We will use the notation
⋂F only when F ≠ ∅.
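
For a finite family of finite sets, Definition 2.3.5 translates almost word
for word into code. The Python sketch below is one possible rendering; the
function names are assumptions introduced here, and the choice to raise an
error for an empty family simply mirrors the convention of using ⋂F only when
F ≠ ∅.

```python
def family_union(F):
    """⋃F = {x | ∃A ∈ F(x ∈ A)}."""
    return {x for A in F for x in A}


def family_intersection(F):
    """⋂F = {x | ∀A ∈ F(x ∈ A)}, for a nonempty family F."""
    F = list(F)
    if not F:
        raise ValueError("the intersection of an empty family is left undefined")
    # Every element of ⋂F lies in some set of F, so it suffices to test
    # the elements of ⋃F.
    return {x for x in family_union(F) if all(x in A for A in F)}


F = [{1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}]  # the family from Example 2.3.4
print(family_intersection(F))  # {3, 4}
print(family_union(F))         # {1, 2, 3, 4, 5, 6}
```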


Notice that if A and B are any two sets and F = {A, B}, then ⋂F = A ∩ B
and ⋃F = A ∪ B. Thus, the definitions of intersection and union of a family
of sets are actually generalizations of our old definitions of the intersection
and union of two sets.


Example 2.3.6. Analyze the logical forms of the following statements. 


1. x ∈ ⋂F.
2. ⋂F ⊈ ⋃G.
3. x ∈ 𝒫(⋃F).
4. x ∈ ⋃{𝒫(A) | A ∈ F}.


Solutions


1. By the definition of the intersection of a family of sets, this means ∀A
∈ F(x ∈ A), or equivalently, ∀A(A ∈ F → x ∈ A).

2. As we saw in Example 2.2.1, to say that one set is not a subset of
another means that there is something that is an element of the first
but not the second. Thus, we start by writing ∃x(x ∈ ⋂F ∧ x ∉ ⋃G).
We have already written out what x ∈ ⋂F means in solution 1. By
the definition of the union of a family of sets, x ∈ ⋃G means ∃A ∈
G(x ∈ A), so x ∉ ⋃G means ¬∃A ∈ G(x ∈ A). By the quantifier
negation laws, this is equivalent to ∀A ∈ G(x ∉ A). Putting this all
together, our answer is ∃x[∀A ∈ F(x ∈ A) ∧ ∀A ∈ G(x ∉ A)].

3. Because the union symbol occurs within the power set notation, we
start by writing out the definition of power set. As in Example 2.3.3,
we get x ⊆ ⋃F, or in other words ∀y(y ∈ x → y ∈ ⋃F). Now we use
the definition of union to write out y ∈ ⋃F as ∃A ∈ F(y ∈ A). The
final answer is ∀y(y ∈ x → ∃A ∈ F(y ∈ A)).

4. This time we start by writing out the definition of union. According to
this definition, the statement means that x is an element of at least one
of the sets 𝒫(A), for A ∈ F. In other words, ∃A ∈ F(x ∈ 𝒫(A)).
Inserting our analysis of the statement x ∈ 𝒫(A) from Example 2.3.3,
we get ∃A ∈ F ∀y(y ∈ x → y ∈ A).


Writing complex mathematical statements in logical symbols, as we did
in the last example, may sometimes help you understand what the
statements mean and whether they are true or false. For example, suppose
that we once again let C_s be the set of all courses that have been taken by
student s. Let M be the set of math majors and E the set of English majors,
and let F = {C_s | s ∈ M} and G = {C_s | s ∈ E}. With these definitions, what
does statement 2 of Example 2.3.6 mean, and under what circumstances
would it be true? According to our solution for this example, the statement
means ∃x[∀A ∈ F(x ∈ A) ∧ ∀A ∈ G(x ∉ A)], or in other words, there is
something that is an element of each set in F, and that fails to be an element
of each set in G. Taking into account the definitions of F and G that we are
using, this means that there is some course that has been taken by all of the
math majors but none of the English majors. If, for example, all of the math
majors have taken Calculus but none of the English majors have, then the
statement would be true.


As another example, suppose F = {{1, 2, 3}, {2, 3, 4}, {3, 4, 5}}, and x
= {4, 5, 6}. With these definitions, would statement 3 of Example 2.3.6 be
true? You could determine this by finding 𝒫(⋃F) and then checking to see
if x is an element of it, but this would take a very long time, because it turns
out that 𝒫(⋃F) has 32 elements. It is easier to use the translation into
logical symbols given in our solution for this example. According to that
translation, the statement means ∀y(y ∈ x → ∃A ∈ F(y ∈ A)); in other
words, every element of x is in at least one set in F. Looking back at our
definitions of F and x, it is not hard to see that this is false, because 6 ∈ x,
but 6 is not in any of the sets in F.
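
The translated statement is also easy to evaluate mechanically. The short
Python sketch below (an illustration, with variable names chosen here) checks
∀y(y ∈ x → ∃A ∈ F(y ∈ A)) for the F and x just given, without ever building
the power set.

```python
F = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}]
x = {4, 5, 6}

# ∀y(y ∈ x → ∃A ∈ F(y ∈ A))
statement_3 = all(any(y in A for A in F) for y in x)
print(statement_3)             # False

# The elements of x that belong to no set in F.
print([y for y in x if not any(y in A for A in F)])  # [6]

# The size of the power set of ⋃F, computed from the size of ⋃F.
union_of_F = set().union(*F)
print(2 ** len(union_of_F))    # 32
```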


An alternative notation is sometimes used for the union or intersection of
an indexed family of sets. Suppose F = {A_i | i ∈ I}, where each A_i is a set.
Then ⋂F would be the set of all elements common to all the A_i’s, for i ∈ I,
and this can also be written as ⋂_{i ∈ I} A_i. In other words,

⋂F = ⋂_{i ∈ I} A_i = {x | ∀i ∈ I(x ∈ A_i)}.


Similarly, an alternative notation for ⋃F is ⋃_{i ∈ I} A_i, so

⋃F = ⋃_{i ∈ I} A_i = {x | ∃i ∈ I(x ∈ A_i)}.


Returning to our example of courses taken by students, we could use this
notation to write the set of courses taken by all students as ⋂_{s ∈ S} C_s. You
could think of this notation as denoting the result of running through all of
the elements s in S, forming the set C_s for each of them, and then
intersecting all of these sets.


Example 2.3.7. Let I = {1, 2, 3}, and for each i ∈ I let A_i = {i, i + 1, i + 2,
i + 3}. Find ⋂_{i ∈ I} A_i and ⋃_{i ∈ I} A_i.


Solution
First we list the elements of the sets A_i, for i ∈ I:

A_1 = {1, 2, 3, 4}, A_2 = {2, 3, 4, 5}, A_3 = {3, 4, 5, 6}.

Then

⋂_{i ∈ I} A_i = A_1 ∩ A_2 ∩ A_3 = {1, 2, 3, 4} ∩ {2, 3, 4, 5} ∩ {3, 4, 5, 6} = {3, 4},

and similarly

⋃_{i ∈ I} A_i = {1, 2, 3, 4} ∪ {2, 3, 4, 5} ∪ {3, 4, 5, 6} = {1, 2, 3, 4, 5, 6}.


In fact, we can now see that the question asked in this example is exactly
the same as the one in Example 2.3.4, but with different notation.
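
The description of this notation as "running through" the index set suggests a
direct computation. The Python sketch below is an illustration for Example
2.3.7; the dictionary A, with A[i] standing for A_i, is an assumption of the
sketch.

```python
I = {1, 2, 3}
A = {i: {i, i + 1, i + 2, i + 3} for i in I}   # A[i] stands for A_i

# Run through the index set and combine the sets A_i.
union = set().union(*(A[i] for i in I))
intersection = set.intersection(*(A[i] for i in I))

# The same sets, written as elementhood tests.
union_by_test = {x for x in union if any(x in A[i] for i in I)}
intersection_by_test = {x for x in union if all(x in A[i] for i in I)}

print(intersection, intersection == intersection_by_test)  # {3, 4} True
print(union, union == union_by_test)                       # {1, 2, 3, 4, 5, 6} True
```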


Example 2.3.8. For this example our universe of discourse will be the set
S of all students. Let L(x, y) stand for “x likes y” and A(x, y) for “x admires
y.” For each student s, let L_s be the set of all students that s likes. In other
words L_s = {t ∈ S | L(s, t)}. Similarly, let A_s = {t ∈ S | A(s, t)} = the set of
all students that s admires. Describe the following sets.


1. ⋂_{s ∈ S} L_s.

2. ⋃_{s ∈ S} L_s.

3. ⋃_{s ∈ S} L_s \ ⋃_{s ∈ S} A_s.

4. ⋃_{s ∈ S}(L_s \ A_s).

5. (⋂_{s ∈ S} L_s) ∩ (⋂_{s ∈ S} A_s).

6. ⋂_{s ∈ S}(L_s ∩ A_s).

7. ⋃_{b ∈ B} L_b, where B = ⋂_{s ∈ S} A_s.


Solutions 


First of all, note that in general, t ∈ L_s means the same thing as L(s, t), and
similarly t ∈ A_s means A(s, t).


1. ⋂_{s ∈ S} L_s = {t | ∀s ∈ S(t ∈ L_s)} = {t ∈ S | ∀s ∈ S L(s, t)} = the set of
all students who are liked by all students.

2. ⋃_{s ∈ S} L_s = {t | ∃s ∈ S(t ∈ L_s)} = {t ∈ S | ∃s ∈ S L(s, t)} = the set of
all students who are liked by at least one student.

3. As we saw in solution 2, ⋃_{s ∈ S} L_s = the set of all students who are
liked by at least one student. Similarly, ⋃_{s ∈ S} A_s = the set of all
students who are admired by at least one student. Thus ⋃_{s ∈ S} L_s \
⋃_{s ∈ S} A_s = {t | t ∈ ⋃_{s ∈ S} L_s and t ∉ ⋃_{s ∈ S} A_s} = the set of all
students who are liked by at least one student, but are not admired by any
students.

4. ⋃_{s ∈ S}(L_s \ A_s) = {t | ∃s ∈ S(t ∈ L_s \ A_s)} = {t ∈ S | ∃s ∈ S(L(s, t) ∧
¬A(s, t))} = the set of all students t such that some student likes t, but
doesn’t admire t. Note that this is different from the set in part 3. For a
student t to be in this set, there must be a student who likes t but
doesn’t admire t, but there could be other students who admire t. To
be in the set in part 3, t must be admired by nobody.

5. (⋂_{s ∈ S} L_s) ∩ (⋂_{s ∈ S} A_s) = {t | t ∈ ⋂_{s ∈ S} L_s and t ∈ ⋂_{s ∈ S} A_s}
= {t | ∀s ∈ S(t ∈ L_s) ∧ ∀s ∈ S(t ∈ A_s)} = {t ∈ S | ∀s ∈ S L(s, t) ∧
∀s ∈ S A(s, t)} = the set of all students who are liked by all students
and also admired by all students.

6. ⋂_{s ∈ S}(L_s ∩ A_s) = {t | ∀s ∈ S(t ∈ L_s ∩ A_s)} = {t ∈ S | ∀s ∈ S(L(s, t) ∧
A(s, t))} = the set of all students who are both liked and admired by
all students. This is the same as the set in part 5. In fact, you can use
the distributive law for universal quantification and conjunction to
show that the elementhood tests for the two sets are equivalent.

7. ⋃_{b ∈ B} L_b = {t | ∃b ∈ B(t ∈ L_b)} = {t ∈ S | ∃b(b ∈ B ∧ L(b, t))}. But B
was defined to be the set of all students who are admired by all
students, so b ∈ B means b ∈ S ∧ ∀s ∈ S A(s, b). Inserting this, we
get ⋃_{b ∈ B} L_b = {t ∈ S | ∃b(b ∈ S ∧ ∀s ∈ S A(s, b) ∧ L(b, t))} = the set
of all students who are liked by some student who is admired by all
students.
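
To see these sets in a concrete case, the Python sketch below works through
all seven parts for a made-up class of three students. Everything about the
data is hypothetical: the names and the likes and admires relations were
invented for this illustration, and the helpers big_union and big_intersection
are assumptions of the sketch.

```python
S = {"ann", "bob", "cat"}  # hypothetical set of students

# (s, t) in likes means L(s, t); (s, t) in admires means A(s, t).
likes = {("ann", "bob"), ("ann", "cat"), ("bob", "bob"),
         ("bob", "cat"), ("cat", "bob")}
admires = {("ann", "bob"), ("bob", "bob"), ("cat", "ann"),
           ("cat", "bob"), ("cat", "cat")}

L = {s: {t for t in S if (s, t) in likes} for s in S}    # L[s] stands for L_s
A = {s: {t for t in S if (s, t) in admires} for s in S}  # A[s] stands for A_s


def big_union(sets):
    return set().union(*sets)


def big_intersection(sets):
    return set.intersection(*sets)  # requires at least one set


part1 = big_intersection(L[s] for s in S)
part2 = big_union(L[s] for s in S)
part3 = big_union(L[s] for s in S) - big_union(A[s] for s in S)
part4 = big_union(L[s] - A[s] for s in S)
part5 = big_intersection(L[s] for s in S) & big_intersection(A[s] for s in S)
part6 = big_intersection(L[s] & A[s] for s in S)
B = big_intersection(A[s] for s in S)
part7 = big_union(L[b] for b in B)

print(part3, part4)    # set() {'cat'}
print(part5 == part6)  # True
```

With this data the sets in parts 3 and 4 differ: cat is liked by ann, who does
not admire cat, but cat is admired by someone (namely cat), so cat is in the
set from part 4 but not the set from part 3.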


Exercises 


*1. Analyze the logical forms of the following statements. You may use
the symbols ∈, ∉, =, ≠, ∧, ∨, →, ↔, ∀, and ∃ in your answers, but
not ⊆, ⊈, 𝒫, ∩, ∪, \, {, }, or ¬. (Thus, you must write out the
definitions of some set theory notation, and you must use
equivalences to get rid of any occurrences of ¬.)

(a) F ⊆ 𝒫(A).

(b) A ⊆ {2n + 1 | n ∈ N}.

(c) {n² + n + 1 | n ∈ N} ⊆ {2n + 1 | n ∈ N}.

2. Analyze the logical forms of the following statements. You may use
the symbols ∈, ∉, =, ≠, ∧, ∨, →, ↔, ∀, and ∃ in your answers, but
not ⊆, ⊈, 𝒫, ∩, ∪, \, {, }, or ¬. (Thus, you must write out the
definitions of some set theory notation, and you must use
equivalences to get rid of any occurrences of ¬.)

(a) x ∈ ⋃F \ ⋃G.

(b) {x ∈ B | x ∉ C} ∈ 𝒫(A).

(c) x ∈ ⋂_{i ∈ I}(A_i ∪ B_i).

(d) x ∈ (⋂_{i ∈ I} A_i) ∪ (⋂_{i ∈ I} B_i).

3. We’ve seen that 𝒫(∅) = {∅}, and {∅} ≠ ∅. What is 𝒫({∅})?

4. Suppose F = {{red, green, blue}, {orange, red, blue}, {purple, red,
green, blue}}. Find ⋂F and ⋃F.

5. Suppose F = {{3, 7, 12}, {5, 7, 16}, {5, 12, 23}}. Find ⋂F and ⋃F.

6. Let I = {2, 3, 4, 5}, and for each i ∈ I let A_i = {i, i + 1, i − 1, 2i}.

(a) List the elements of all the sets A_i, for i ∈ I.

(b) Find ⋂_{i ∈ I} A_i and ⋃_{i ∈ I} A_i.

7. Let P = {Johann Sebastian Bach, Napoleon Bonaparte, Johann
Wolfgang von Goethe, David Hume, Wolfgang Amadeus Mozart, Isaac
Newton, George Washington} and let Y = {1750, 1751, 1752, ...,
1759}. For each y ∈ Y, let A_y = {p ∈ P | the person p was alive at
some time during the year y}. Find ⋃_{y ∈ Y} A_y and ⋂_{y ∈ Y} A_y.

8. Let I = {2, 3}, and for each i ∈ I let A_i = {i, 2i} and B_i = {i, i + 1}.

(a) List the elements of the sets A_i and B_i for i ∈ I.

(b) Find ⋂_{i ∈ I}(A_i ∪ B_i) and (⋂_{i ∈ I} A_i) ∪ (⋂_{i ∈ I} B_i). Are they the
same?

(c) In parts (c) and (d) of exercise 2 you analyzed the statements x ∈
⋂_{i ∈ I}(A_i ∪ B_i) and x ∈ (⋂_{i ∈ I} A_i) ∪ (⋂_{i ∈ I} B_i). What can you
conclude from your answer to part (b) about whether or not these
statements are equivalent?

9. (a) Analyze the logical forms of the statements x ∈ ⋃_{i ∈ I}(A_i \ B_i),
x ∈ (⋃_{i ∈ I} A_i) \ (⋃_{i ∈ I} B_i), and x ∈ (⋃_{i ∈ I} A_i) \ (⋂_{i ∈ I} B_i). Do
you think that any of these statements are equivalent to each other?

(b) Let I, A_i, and B_i be defined as in exercise 8. Find ⋃_{i ∈ I}(A_i \ B_i),
(⋃_{i ∈ I} A_i) \ (⋃_{i ∈ I} B_i), and (⋃_{i ∈ I} A_i) \ (⋂_{i ∈ I} B_i). Now do you
think any of the statements in part (a) are equivalent?

10. Give an example of an index set I and indexed families of sets {A_i | i
∈ I} and {B_i | i ∈ I} such that ⋃_{i ∈ I}(A_i ∩ B_i) ≠ (⋃_{i ∈ I} A_i) ∩
(⋃_{i ∈ I} B_i).

11. Show that for any sets A and B, 𝒫(A ∩ B) = 𝒫(A) ∩ 𝒫(B), by
showing that the statements x ∈ 𝒫(A ∩ B) and x ∈ 𝒫(A) ∩ 𝒫(B)
are equivalent. (See Example 2.3.3.)

*12. Give examples of sets A and B for which 𝒫(A ∪ B) ≠ 𝒫(A) ∪ 𝒫(B).

13. Verify the following identities by writing out (using logical symbols)
what it means for an object x to be an element of each set and then
using logical equivalences.

(a) ⋃_{i ∈ I}(A_i ∪ B_i) = (⋃_{i ∈ I} A_i) ∪ (⋃_{i ∈ I} B_i).

(b) (⋂F) ∩ (⋂G) = ⋂(F ∪ G).

(c) ⋂_{i ∈ I}(A_i \ B_i) = (⋂_{i ∈ I} A_i) \ (⋃_{i ∈ I} B_i).

*14. Sometimes each set in an indexed family of sets has two indices. For
this problem, use the following definitions: I = {1, 2}, J = {3, 4}. For
each i ∈ I and j ∈ J, let A_{i,j} = {i, j, i + j}. Thus, for example, A_{2,3} =
{2, 3, 5}.

(a) For each j ∈ J let B_j = ⋃_{i ∈ I} A_{i,j} = A_{1,j} ∪ A_{2,j}. Find B_3 and B_4.

(b) Find ⋂_{j ∈ J} B_j. (Note that, replacing B_j with its definition, we could
say that ⋂_{j ∈ J} B_j = ⋂_{j ∈ J}(⋃_{i ∈ I} A_{i,j}).)

(c) Find ⋃_{i ∈ I}(⋂_{j ∈ J} A_{i,j}). (Hint: You may want to do this in two steps,
corresponding to parts (a) and (b).) Are ⋂_{j ∈ J}(⋃_{i ∈ I} A_{i,j}) and
⋃_{i ∈ I}(⋂_{j ∈ J} A_{i,j}) equal?

(d) Analyze the logical forms of the statements x ∈ ⋂_{j ∈ J}(⋃_{i ∈ I} A_{i,j}) and
x ∈ ⋃_{i ∈ I}(⋂_{j ∈ J} A_{i,j}). Are they equivalent?

15. (a) Show that if F = ∅, then the statement x ∈ ⋃F will be false no
matter what x is. It follows that ⋃∅ = ∅.

(b) Show that if F = ∅, then the statement x ∈ ⋂F will be true no
matter what x is. In a context in which it is clear what the universe of
discourse U is, we might therefore want to say that ⋂∅ = U.
However, this has the unfortunate consequence that the notation ⋂∅
will mean different things in different contexts. Furthermore, when
working with sets whose elements are sets, mathematicians often do
not use a universe of discourse at all. (For more on this, see the next
exercise.) For these reasons, some mathematicians consider the
notation ⋂∅ to be meaningless. We will avoid this problem in this
book by using the notation ⋂F only in contexts in which we can be
sure that F ≠ ∅.

16. In Section 2.3 we saw that a set can have other sets as elements.
When discussing sets whose elements are sets, it might seem most
natural to consider the universe of discourse to be the set of all sets.
However, as we will see in this problem, assuming that there is such a
set leads to contradictions.

Suppose U were the set of all sets. Note that in particular U is a set,
so we would have U ∈ U. This is not yet a contradiction; although
most sets are not elements of themselves, perhaps some sets are
elements of themselves. But it suggests that the sets in the universe
U could be split into two categories: the unusual sets that, like U
itself, are elements of themselves, and the more typical sets that are
not. Let R be the set of sets in the second category. In other words, R
= {A ∈ U | A ∉ A}. This means that for any set A in the universe U,
A will be an element of R iff A ∉ A. In other words, we have ∀A ∈
U(A ∈ R ↔ A ∉ A).

(a) Show that applying this last fact to the set R itself (in other words,
plugging in R for A) leads to a contradiction. This contradiction was
discovered by Bertrand Russell (1872–1970) in 1901, and is known
as Russell’s paradox.

(b) Think some more about the paradox in part (a). What do you think it
tells us about sets?


Proofs 


3.1 Proof Strategies 


Mathematicians are skeptical people. They use many methods, including 
experimentation with examples, trial and error, and guesswork, to try to find 
answers to mathematical questions, but they are generally not convinced 
that an answer is correct unless they can prove it. You have probably seen 
some mathematical proofs before (there were some examples in the 
introduction), but you may not have any experience writing them yourself. 
In this chapter you’ll learn more about how proofs are put together, so you
can start writing your own proofs. 

Proofs are a lot like jigsaw puzzles. There are no rules about how jigsaw 
puzzles must be solved. The only rule concerns the final product: all the 
pieces must fit together, and the picture must look right. The same holds for 
proofs. 

Although there are no rules about how jigsaw puzzles must be solved, 
some techniques for solving them work better than others. For example, 
you’d never do a jigsaw puzzle by filling in every other piece, and then 
going back and filling in the holes! But you also don’t do it by starting at 
the top and filling in the pieces in order until you reach the bottom. You 
probably fill in the border first, and then gradually put other chunks of the 
puzzle together and figure out where they go. Sometimes you try to put 
pieces in the wrong places, realize that they don’t fit, and feel that you’re 
not making any progress. And every once in a while you see, in a satisfying 
flash, how two big chunks fit together and feel that you’ve suddenly made a 
lot of progress. As the pieces of the puzzle fall into place, a picture 
emerges. You suddenly realize that the patch of blue you’ve been putting 
together is a lake, or part of the sky. But it’s only when the puzzle is 
complete that you can see the whole picture. 


Similar things could be said about the process of figuring out a proof. 
And I think one more similarity should be mentioned. When you finish a 
jigsaw puzzle, you don’t take it apart right away, do you? You probably 
leave it out for a day or two, so you can admire it. You should do the same 
thing with a proof. You figured out how to fit it together yourself, and once 
it’s all done, isn’t it pretty? 

In this chapter we will discuss the proof-writing techniques that 
mathematicians use most often and explain how to use them to begin 
writing proofs yourself. Understanding these techniques may also help you 
read and understand proofs written by other people. Unfortunately, the 
techniques in this chapter do not give a step-by-step procedure for solving 
every proof problem. When trying to write a proof you may make a few 
false starts before finding the right way to proceed, and some proofs may 
require some cleverness or insight. With practice your proof-writing skills 
should improve, and you’ll be able to tackle more and more challenging 
proofs. 

Mathematicians usually state the answer to a mathematical question in 
the form of a theorem that says that if certain assumptions called the 
hypotheses of the theorem are true, then some conclusion must also be true. 
Often the hypotheses and conclusion contain free variables, and in this case 
it is understood that these variables can stand for any elements of the 
universe of discourse. An assignment of particular values to these variables 
is called an instance of the theorem, and in order for the theorem to be 
correct it must be the case that for every instance of the theorem that makes 
the hypotheses come out true, the conclusion is also true. If there is even 
one instance in which the hypotheses are true but the conclusion is false, 
then the theorem is incorrect. Such an instance is called a counterexample to 
the theorem. 


Example 3.1.1. Consider the following theorem: 


Theorem. Suppose x > 3 and y < 2. Then x² - 2y > 5.


This theorem is correct. (You are asked to prove it in exercise 15.) The
hypotheses of the theorem are x > 3 and y < 2, and the conclusion is x² - 2y
> 5. As an instance of the theorem, we could plug in 5 for x and 1 for y.
Clearly with these values of the variables the hypotheses x > 3 and y < 2 are
both true, so the theorem tells us that the conclusion x² - 2y > 5 must also
be true. In fact, plugging in the values of x and y we find that x² - 2y = 25 -
2 = 23, and certainly 23 > 5. Note that this calculation does not constitute a
proof of the theorem. We have only checked one instance of the theorem,
and a proof would have to show that all instances are correct.

If we drop the second hypothesis, then we get an incorrect theorem: 


Incorrect Theorem. Suppose x > 3. Then x² - 2y > 5.


We can see that this theorem is incorrect by finding a counterexample. For
example, suppose we let x = 4 and y = 6. Then the only remaining
hypothesis, x > 3, is true, but x² - 2y = 16 - 12 = 4, so the conclusion x² -
2y > 5 is false.
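
The difference between checking one instance and finding a counterexample can be illustrated with a small brute-force search. The following Python sketch is only an illustration: the function names and the finite grid of integers are assumptions made here, not part of the text. Finding no counterexample on a finite grid does not prove a theorem; only a counterexample that is actually found settles anything.

```python
from itertools import product


def conclusion(x, y):
    """The conclusion x^2 - 2y > 5 shared by both statements."""
    return x * x - 2 * y > 5


def find_counterexample(hypotheses, values):
    """Return the first (x, y) where the hypotheses hold but the conclusion fails."""
    for x, y in product(values, repeat=2):
        if hypotheses(x, y) and not conclusion(x, y):
            return (x, y)
    return None


grid = range(-10, 11)  # arbitrary finite sample of integers (an assumption)

# One instance of the correct theorem: x = 5, y = 1.
instance_ok = conclusion(5, 1)

# Correct theorem: hypotheses x > 3 and y < 2.
cex_correct = find_counterexample(lambda x, y: x > 3 and y < 2, grid)

# Incorrect theorem: only the hypothesis x > 3 remains.
cex_incorrect = find_counterexample(lambda x, y: x > 3, grid)

print(instance_ok, cex_correct, cex_incorrect)
```

On this grid the search finds nothing for the correct theorem and returns (4, 6) for the incorrect one, which is the counterexample used above.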


If you find a counterexample to a theorem, then you can be sure that the 
theorem is incorrect, but the only way to know for sure that a theorem is 
correct is to prove it. A proof of a theorem is simply a deductive argument 
whose premises are the hypotheses of the theorem and whose conclusion is 
the conclusion of the theorem. Throughout the proof, we think of any free 
variables in the hypotheses and conclusion of the theorem as standing for 
some particular but unspecified elements of the universe of discourse. In 
other words, we imagine that we are reasoning about some instance of the 
theorem, but we don’t actually choose a particular instance; the reasoning in 
the proof should apply to all instances. Of course the argument should be 
valid, so we can be sure that if the hypotheses of the theorem are true for 
any instance, then the conclusion will be true for that instance as well. 

How you figure out and write up the proof of a theorem will depend 
mostly on the logical form of the conclusion. Often it will also depend on 
the logical forms of the hypotheses. The proof-writing techniques we will 
discuss in this chapter will tell you which proof strategies are most likely to 
work for various forms of hypotheses and conclusions. 

Proof-writing techniques that are based on the logical forms of the 
hypotheses usually suggest ways of drawing inferences from the 
hypotheses. When you draw an inference from the hypotheses, you use the 
assumption that the hypotheses are true to justify the assertion that some 
other statement is also true. Once you have shown that a statement is true, 
you can use it later in the proof exactly as if it were a hypothesis. Perhaps
the most important rule to keep in mind when drawing such inferences is
this: Never assert anything until you can justify it completely using the 
hypotheses or using conclusions reached from them earlier in the proof. 
Your motto should be: “I shall make no assertion before its time.” 
Following this rule will prevent you from using circular reasoning or 
jumping to conclusions and will guarantee that, if the hypotheses are true, 
then the conclusion must also be true. And this is the primary purpose of 
any proof: to provide a guarantee that the conclusion is true if the 
hypotheses are. 

To make sure your assertions are adequately justified, you must be 
skeptical about every inference in your proof. If there is any doubt in your 
mind about whether the justification you have given for an assertion is 
adequate, then it isn’t. After all, if your own reasoning doesn’t even 
convince you, how can you expect it to convince anybody else? 

Proof-writing techniques based on the logical form of the conclusion are 
often somewhat different from techniques based on the forms of the 
hypotheses. They usually suggest ways of transforming the problem into 
one that is equivalent but easier to solve. The idea of solving a problem by 
transforming it into an easier problem should be familiar to you. For 
example, adding the same number to both sides of an equation transforms 
the equation into an equivalent equation, and the resulting equation is 
sometimes easier to solve than the original one. Students who have studied 
calculus may be familiar with techniques of evaluating integrals, such as 
substitution or integration by parts, that can be used to transform a difficult 
integration problem into an easier one. 

Proofs that are written using these transformation strategies often include 
steps in which you assume for the sake of argument that some statement is 
true without providing any justification for that assumption. It may seem at 
first that such reasoning would violate the rule that assertions must always 
be justified, but it doesn’t, because assuming something is not the same as 
asserting it. To assert a statement is to claim that it is true, and such a claim 
is never acceptable in a proof unless it can be justified. However, the 
purpose of making an assumption in a proof is not to make a claim about 
what is true, but rather to enable you to find out what would be true if the 
assumption were correct. You must always keep in mind that any 
conclusion you reach that is based on an assumption might turn out to be 
false if the assumption is incorrect. Whenever you make a statement in a
proof, it’s important to be sure you know whether it’s an assertion or an
assumption.

Perhaps an example will help clarify this. Suppose that during the course 
of a proof you decide to assume that some statement, call it P, is true, and 
you use this assumption to conclude that another statement Q is true. It 
would be wrong to call this a proof that Q is true, because you can’t be sure 
that your assumption about the truth of P was correct. All you can conclude 
at this point is that if P is true, then you can be sure that Q is true as well. In 
other words, you know that the statement P → Q is true. If the conclusion
of the theorem being proven was Q, then the proof is incomplete at best.
But if the conclusion was P → Q, then the proof is complete. This brings us
to our first proof strategy. 


To prove a conclusion of the form P → Q:
Assume P is true and then prove Q. 


Here’s another way of looking at what this proof technique means. 
Assuming that P is true amounts to the same thing as adding P to your list 
of hypotheses. Although P might not originally have been one of your 
hypotheses, once you have assumed it, you can use it exactly the way you 
would use any other hypothesis. Proving Q means treating Q as your 
conclusion and forgetting about the original conclusion. So this technique 
says that if the conclusion of the theorem you are trying to prove has the 
form P → Q, then you can transform the problem by adding P to your list
of hypotheses and changing your conclusion from P → Q to Q. This gives
you a new, perhaps easier proof problem to work on. If you can solve the
new problem, then you will have shown that if P is true then Q is also true,
thus solving the original problem of proving P → Q. How you solve this
new problem will now be guided by the logical form of the new conclusion 
Q (which might itself be a complex statement), and perhaps also by the 
logical form of the new hypothesis P. 

Note that this technique doesn’t tell you how to do the whole proof, it 
just gives you one step, leaving you with a new problem to solve in order to 
finish the proof. Proofs are usually not written all at once, but are created 
gradually by applying several proof techniques one after another. Often the 
use of these techniques will lead you to transform the problem several 
times. In discussing this process it will be helpful to have some way to keep
track of the results of this sequence of transformations. We therefore
introduce the following terminology. We will refer to the statements that are 
known or assumed to be true at some point in the course of figuring out a 
proof as givens, and the statement that remains to be proven at that point as 
the goal. When you are starting to figure out a proof, the givens will be just 
the hypotheses of the theorem you are proving, but they may later include 
other statements that have been inferred from the hypotheses or added as 
new assumptions as the result of some transformation of the problem. The 
goal will initially be the conclusion of the theorem, but it may be changed 
several times in the course of figuring out a proof. 

To keep in mind that all of our proof strategies apply not only to the 
original proof problem but also to the results of any transformation of the 
problem, we will talk from now on only about givens and goals, rather than 
hypotheses and conclusions, when discussing proof-writing strategies. For 
example, the strategy stated earlier should really be called a strategy for 
proving a goal of the form P → Q, rather than a conclusion of this form.
Even if the conclusion of the theorem you are proving is not a conditional 
statement, if you transform the problem in such a way that a conditional 
statement becomes the goal, then you can apply this strategy as the next 
step in figuring out the proof. 


Example 3.1.2. Suppose a and b are real numbers. Prove that if 0 < a < b
then a² < b².


Scratch work 


We are given as a hypothesis that a and b are real numbers. Our conclusion
has the form P → Q, where P is the statement 0 < a < b and Q is the
statement a² < b². Thus we start with these statements as given and goal:

Givens                        Goal
a and b are real numbers      (0 < a < b) → (a² < b²)

According to our proof technique we should assume that 0 < a < b and
try to use this assumption to prove that a² < b². In other words, we
transform the problem by adding 0 < a < b to the list of givens and making
a² < b² our goal:


Givens                        Goal
a and b are real numbers      a² < b²
0 < a < b


Comparing the inequalities a < b and a² < b² suggests that multiplying
both sides of the given inequality a < b by either a or b might get us closer
to our goal. Because we are given that a and b are positive, we won’t need
to reverse the direction of the inequality if we do this. Multiplying a < b by
a gives us a² < ab, and multiplying it by b gives us ab < b². Thus a² < ab <
b², so a² < b².


Solution 


Theorem. Suppose a and b are real numbers. If 0 < a < b then a² < b².


Proof. Suppose 0 < a < b. Multiplying the inequality a < b by the positive
number a we can conclude that a² < ab, and similarly multiplying by b we
get ab < b². Therefore a² < ab < b², so a² < b², as required. Thus, if 0 < a
< b then a² < b².
□
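
As a sanity check on the two multiplication steps in this proof, the chain a² < ab < b² can be spot-checked numerically. The Python sketch below is only an illustration under assumptions chosen here (the helper name, the random sample of rational numbers, and the fixed seed are not from the text). Exact fractions are used to avoid rounding error. Such a check can expose a false step, but it is no substitute for the proof.

```python
from fractions import Fraction
import random


def chain_holds(a, b):
    """Check the two steps used in the proof: a^2 < ab and ab < b^2."""
    return a * a < a * b < b * b


random.seed(0)  # fixed seed so the sample is reproducible
samples = []
for _ in range(1000):
    a = Fraction(random.randint(1, 1000), random.randint(1, 1000))
    # Adding a positive amount guarantees 0 < a < b.
    b = a + Fraction(random.randint(1, 1000), random.randint(1, 1000))
    samples.append((a, b))

all_ok = all(chain_holds(a, b) for a, b in samples)

# Without the hypothesis 0 < a the chain can fail, for example a = -2, b = 1.
fails_without_positivity = not chain_holds(Fraction(-2), Fraction(1))

print(all_ok, fails_without_positivity)
```

The second check shows why the proof needs a and b to be positive: multiplying a < b by a negative number reverses the inequality.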
As you can see from the preceding example, there’s a difference between 
the reasoning you use when you are figuring out a proof and the steps you 
write down when you write the final version of the proof. In particular, 
although we will often talk about givens and goals when trying to figure out 
a proof, the final write-up will generally not refer to them. Throughout this 
chapter, and sometimes in later chapters as well, we will precede our proofs 
with the scratch work used to figure out the proof, but this is just to help 
you understand how proofs are constructed. When mathematicians write 
proofs, they usually just write the steps needed to justify their conclusions 
with no explanation of how they thought of them. Some of these steps will 
be sentences indicating that the problem has been transformed (usually 
according to some proof strategy based on the logical form of the goal); 
some steps will be assertions that are justified by inferences from the givens 
(often using some proof strategy based on the logical form of a given). 
However, there will usually be no explanation of how the mathematician 
thought of these transformations and inferences. For example, the proof in 
Example 3.1.2 starts with the sentence “Suppose 0 < a < b,” indicating that 
the problem has been transformed according to our strategy, and then
proceeds with a sequence of inferences leading to the conclusion that a² <
b². No other explanations were necessary to justify the final conclusion, in
the last sentence, that if 0 < a < b then a² < b².

Although this lack of explanation sometimes makes proofs hard to read, 
it serves the purpose of keeping two distinct objectives separate: explaining 
your thought processes and justifying your conclusions. The first is 
psychology; the second, mathematics. The primary purpose of a proof is to 
justify the claim that the conclusion follows from the hypotheses, and no 
explanation of your thought processes can substitute for adequate 
justification of this claim. Keeping any discussion of thought processes to a 
minimum in a proof helps to keep this distinction clear. Occasionally, in a 
very complicated proof, a mathematician may include some discussion of 
the strategy behind the proof to make the proof easier to read. Usually, 
however, it is up to readers to figure this out for themselves. Don’t worry if 
you don’t immediately understand the strategy behind a proof you are 
reading. Just try to follow the justifications of the steps, and the strategy 
will eventually become clear. If it doesn’t, a second reading of the proof 
might help. 

To keep the distinction between the proof and the strategy behind the 
proof clear, in the future when we state a proof strategy we will often 
describe both the scratch work you might use to figure out the proof and the 
form that the final write-up of the proof should take. For example, here’s a 
restatement of the proof strategy we discussed earlier, in the form we will 
be using to present proof strategies from now on. 


To prove a goal of the form P → Q:
Assume P is true and then prove Q.


Scratch work


Before using strategy:


Givens      Goal
—           P → Q


After using strategy:


Givens      Goal
—           Q
P


Form of final proof:


Suppose P.
    [Proof of Q goes here.]
Therefore P → Q.


Note that the suggested form for the final proof tells you how the 
beginning and end of the proof will go, but more steps will have to be 
added in the middle. The givens and goal list under the heading “After 
using strategy” tells you what is known or can be assumed and what needs 
to be proven in order to fill in this gap in the proof. Many of our proof 
strategies will tell you how to write either the beginning or the end of your 
proof, leaving a gap to be filled in with further reasoning. 

There is a second method that is sometimes used for proving goals of the
form P → Q. Because any conditional statement P → Q is equivalent to its
contrapositive ¬Q → ¬P, you can prove P → Q by proving ¬Q → ¬P
instead, using the strategy discussed earlier. In other words:


To prove a goal of the form P → Q:
Assume Q is false and prove that P is false.


Scratch work


Before using strategy:


Givens      Goal
—           P → Q


After using strategy:


Givens      Goal
—           ¬P
¬Q


Form of final proof:


Suppose Q is false.
    [Proof of ¬P goes here.]
Therefore P → Q.
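
The equivalence between a conditional and its contrapositive, on which this second strategy rests, can be confirmed by checking all four truth-value combinations. The following Python sketch is offered only as an illustration; the helper name implies is an assumption made here.

```python
from itertools import product


def implies(p, q):
    """Truth value of the conditional p -> q."""
    return (not p) or q


# Compare P -> Q with its contrapositive (not Q) -> (not P) on every row.
rows = [
    (p, q, implies(p, q), implies(not q, not p))
    for p, q in product([True, False], repeat=2)
]
equivalent = all(direct == contra for _, _, direct, contra in rows)

for row in rows:
    print(row)
print(equivalent)
```

Each printed row lists P, Q, the value of P → Q, and the value of ¬Q → ¬P; the last two entries agree in every row.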


Example 3.1.3. Suppose a, b, and c are real numbers and a > b. Prove that
if ac ≤ bc then c ≤ 0.


Scratch work

Givens                          Goal
a, b, and c are real numbers    (ac ≤ bc) → (c ≤ 0)
a > b

The contrapositive of the goal is ¬(c ≤ 0) → ¬(ac ≤ bc), or in other words
(c > 0) → (ac > bc), so we can prove it by adding c > 0 to the list of givens
and making ac > bc our new goal:


Givens                          Goal
a, b, and c are real numbers    ac > bc
a > b
c > 0


We can also now write the first and last sentences of the proof. According
to the strategy, the final proof should have this form:


Suppose c > 0.
    [Proof of ac > bc goes here.]
Therefore, if ac ≤ bc then c ≤ 0.


Using the new given c > 0, we see that the goal ac > bc follows
immediately from the given a > b by multiplying both sides by the positive
number c. Inserting this step between the first and last sentences completes
the proof.


Solution


Theorem. Suppose a, b, and c are real numbers and a > b. If ac ≤ bc then c
≤ 0.


Proof. We will prove the contrapositive. Suppose c > 0. Then we can
multiply both sides of the given inequality a > b by c and conclude that ac
> bc. Therefore, if ac ≤ bc then c ≤ 0.
□

Notice that, although we have used the symbols of logic freely in the 
scratch work, we have not used them in the final write-up of the proof. 
Although it would not be incorrect to use logical symbols in a proof, 
mathematicians usually try to avoid it. Using the notation and rules of logic 
can be very helpful when you are figuring out the strategy for a proof, but 
in the final write-up you should try to stick to ordinary English as much as 
possible. 

You may be wondering how we knew in Example 3.1.3 that we should 
use the second method for proving a goal of the form P → Q rather than the
first. The answer is simple: we tried both methods, and the second worked. 
When there is more than one strategy for proving a goal of a particular 
form, you may have to try a few different strategies before you hit on one 
that works. With practice, you will get better at guessing which strategy is 
most likely to work for a particular proof. 

Notice that in each of the examples we have given, our strategy involved 
making changes in our givens and goal to try to make the problem easier. 
The beginning and end of the proof, which were supplied for us in the 
statement of the proof technique, serve to tell a reader of the proof that 
these changes have been made and how the solution to this revised problem 
solves the original problem. The rest of the proof contains the solution to 
this easier, revised problem. 

Most of the other proof techniques in this chapter also suggest that you 
revise your givens and goal in some way. These revisions result in a new 
proof problem, and in every case the revisions have been designed so that a 
solution to the new problem, when combined with some beginning or 
ending sentences explaining these revisions, would also solve the original 
problem. This means that whenever you use one of these strategies you can 
write a sentence or two at the beginning or end of the proof and then forget 
about the original problem and work instead on the new problem, which 
will usually be easier. Often you will be able to figure out a proof by using 
the techniques in this chapter to revise your givens and goal repeatedly, 
making the remaining problem easier and easier until you reach a point at 
which it is completely obvious that the goal follows from the givens. 


Exercises 


1. Consider the following theorem. (This theorem was proven in the
introduction.)

Theorem. Suppose n is an integer larger than 1 and n is not prime.
Then 2ⁿ - 1 is not prime.

(a) Identify the hypotheses and conclusion of the theorem. Are the
hypotheses true when n = 6? What does the theorem tell you in
this instance? Is it right?
(b) What can you conclude from the theorem in the case n = 15?
Check directly that this conclusion is correct.
(c) What can you conclude from the theorem in the case n = 11?

2. Consider the following theorem. (The theorem is correct, but we
will not ask you to prove it here.)

Theorem. Suppose that b² > 4ac. Then the quadratic equation ax² +
bx + c = 0 has exactly two real solutions.

(a) Identify the hypotheses and conclusion of the theorem.
(b) To give an instance of the theorem, you must specify values for
a, b, and c, but not x. Why?
(c) What can you conclude from the theorem in the case a = 2, b =
-5, c = 3? Check directly that this conclusion is correct.
(d) What can you conclude from the theorem in the case a = 2, b = 4,
c = 3?

3. Consider the following incorrect theorem:

Incorrect Theorem. Suppose n is a natural number larger than
2, and n is not a prime number. Then 2n + 13 is not a prime
number.

What are the hypotheses and conclusion of this theorem? Show
that the theorem is incorrect by finding a counterexample.

*4. Complete the following alternative proof of the theorem in
Example 3.1.2.

Proof. Suppose 0 < a < b. Then b - a > 0.
    [Fill in a proof of b² - a² > 0 here.]
Since b² - a² > 0, it follows that a² < b². Therefore if 0 < a < b
then a² < b².
□

5. Suppose a and b are real numbers. Prove that if a < b < 0 then a²
> b².

6. Suppose a and b are real numbers. Prove that if 0 < a < b then
1/b < 1/a.

7. Suppose that a is a real number. Prove that if a³ > a then a⁵ > a.
(Hint: One approach is to start by completing the following
equation: a⁵ - a = (a³ - a) · ?.)

8. Suppose A \ B ⊆ C ∩ D and x ∈ A. Prove that if x ∉ D then x ∈
B.

9. Suppose A ∩ B ⊆ C \ D. Prove that if x ∈ A, then if x ∈ D then x
∉ B.

10. Suppose a and b are real numbers. Prove that if a < b then (a +
b)/2 < b.

11. Suppose x is a real number and x ≠ 0. Prove that if
(∛x + 5)/(x² + 6) = 1/x then x ≠ 8.

12. Suppose a, b, c, and d are real numbers, 0 < a < b, and d > 0.
Prove that if ac ≥ bd then c > d.

13. Suppose x and y are real numbers, and 3x + 2y < 5. Prove that if x
> 1 then y < 1.

14. Suppose that x and y are real numbers. Prove that if x² + y = -3
and 2x - y = 2 then x = -1.

*15. Prove the first theorem in Example 3.1.1. (Hint: You might find it
useful to apply the theorem from Example 3.1.2.)

16. Consider the following theorem.

Theorem. Suppose x is a real number and x ≠ 4. If (2x - 5)/(x - 4) = 3
then x = 7.

(a) What’s wrong with the following proof of the theorem?

Proof. Suppose x = 7. Then (2x - 5)/(x - 4) = (2(7) - 5)/(7 - 4) = 9/3
= 3. Therefore if (2x - 5)/(x - 4) = 3 then x = 7.
□

(b) Give a correct proof of the theorem.

17. Consider the following incorrect theorem:

Incorrect Theorem. Suppose that x and y are real numbers and x
≠ 3. If x²y = 9y then y = 0.

(a) What’s wrong with the following proof of the theorem?

Proof. Suppose that x²y = 9y. Then (x² - 9)y = 0. Since x ≠ 3, x² ≠ 9, so
x² - 9 ≠ 0. Therefore we can divide both sides of the equation (x² - 9)y
= 0 by x² - 9, which leads to the conclusion that y = 0. Thus, if x²y =
9y then y = 0.
□

(b) Show that the theorem is incorrect by finding a counterexample.


3.2 Proofs Involving Negations and Conditionals 


We turn now to proofs in which the goal has the form ¬P. Usually it’s easier
to prove a positive statement than a negative statement, so it is often helpful
to reexpress a goal of the form ¬P before proving it. Instead of trying to
prove a goal that says what shouldn't be true, see if you can rephrase it as a
goal that says what should be true. Fortunately, we have already studied
several equivalences that will help with this reexpression. Thus, our first
strategy for proving negated statements is:


To prove a goal of the form ¬P:


If possible, reexpress the goal in some other form and then use one of the 
proof strategies for this other goal form. 


Example 3.2.1. Suppose A ∩ C ⊆ B and a ∈ C. Prove that a ∉ A \ B.


Scratch work


Givens          Goal
A ∩ C ⊆ B       a ∉ A \ B
a ∈ C


To prove the goal, we must show that it cannot be the case that a ∈ A and
a ∉ B. Because this is a negative goal, we try to reexpress it as a positive
statement:


a ∉ A \ B is equivalent to ¬(a ∈ A ∧ a ∉ B) (definition of A \ B),
which is equivalent to a ∉ A ∨ a ∈ B (De Morgan’s law),
which is equivalent to a ∈ A → a ∈ B (conditional law).


Rewriting the goal in this way gives us:


Givens          Goal
A ∩ C ⊆ B       a ∈ A → a ∈ B
a ∈ C


We now prove the goal in this new form, using the first strategy from
Section 3.1. Thus, we add a ∈ A to our list of givens and make a ∈ B our
goal:


Givens          Goal
A ∩ C ⊆ B       a ∈ B
a ∈ C
a ∈ A


The proof is now easy: from the givens a ∈ A and a ∈ C we can conclude
that a ∈ A ∩ C, and then, since A ∩ C ⊆ B, it follows that a ∈ B.


Solution


Theorem. Suppose A ∩ C ⊆ B and a ∈ C. Then a ∉ A \ B.


Proof. Suppose a ∈ A. Then since a ∈ C, a ∈ A ∩ C. But then since A ∩ C
⊆ B it follows that a ∈ B. Thus, it cannot be the case that a is an element of
A but not B, so a ∉ A \ B.
□
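
The theorem in this example, together with the reexpressed form of its goal, can be checked exhaustively over a small finite universe. The Python sketch below is only an illustration: the three-element universe and the helper name subsets are assumptions made here. The check covers only sets drawn from that universe, so it supports the theorem without proving it.

```python
from itertools import combinations


def subsets(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [
        frozenset(combo)
        for size in range(len(items) + 1)
        for combo in combinations(items, size)
    ]


universe = {0, 1, 2}  # small illustrative universe (an assumption)
checked = 0
violations = []
for A in subsets(universe):
    for B in subsets(universe):
        for C in subsets(universe):
            if not (A & C) <= B:  # given: A ∩ C ⊆ B
                continue
            for a in C:  # given: a ∈ C
                checked += 1
                goal_original = a not in (A - B)  # a ∉ A \ B
                goal_rewritten = (a not in A) or (a in B)  # a ∈ A → a ∈ B
                if not (goal_original and goal_rewritten):
                    violations.append((A, B, C, a))

print(checked, violations)
```

An empty list of violations means that, in every case where both givens hold, the original goal and its rewritten form are both true.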


Sometimes a goal of the form ¬P cannot be reexpressed as a positive
statement, and therefore this strategy cannot be used. In this case it is
usually best to do a proof by contradiction. Start by assuming that P is true,
and try to use this assumption to prove something that you know is false.
Often this is done by proving a statement that contradicts one of the givens.
Because you know that the statement you have proven is false, the
assumption that P was true must have been incorrect. The only remaining
possibility then is that P is false.


To prove a goal of the form ¬P:


Assume P is true and try to reach a contradiction. Once you have reached
a contradiction, you can conclude that P must be false.


Scratch work


Before using strategy:


Givens      Goal
—           ¬P


After using strategy:


Givens      Goal
—           Contradiction
P


Form of final proof:


Suppose P is true.
    [Proof of contradiction goes here.]
Thus, P is false.


Example 3.2.2. Prove that if x² + y = 13 and y ≠ 4 then x ≠ 3.

Scratch work


The goal is a conditional statement, so according to the first proof strategy
in Section 3.1 we can treat the antecedent as given and make the consequent
our new goal:

Givens          Goal
x² + y = 13     x ≠ 3
y ≠ 4


This proof strategy also suggests what form the final proof should take.
According to the strategy, the proof should look like this:


Suppose x² + y = 13 and y ≠ 4.
    [Proof of x ≠ 3 goes here.]
Thus, if x² + y = 13 and y ≠ 4 then x ≠ 3.


In other words, the first and last sentences of the final proof have already 
been written, and the problem that remains to be solved is to fill in a proof 
of x ≠ 3 between these two sentences. The givens-goal list summarizes
what we know and what we have to prove in order to solve this problem.

The goal x ≠ 3 means ¬(x = 3), but because x = 3 has no logical
connectives in it, none of the equivalences we know can be used to
reexpress this goal in a positive form. We therefore try proof by
contradiction and transform the problem as follows:

Givens          Goal
x² + y = 13     Contradiction
y ≠ 4
x = 3


Once again, the proof strategy that suggested this transformation also 
tells us how to fill in a few more sentences of the final proof. As we 
indicated earlier, these sentences go between the first and last sentences of 
the proof, which were written before. 


Suppose x² + y = 13 and y ≠ 4.
    Suppose x = 3.
        [Proof of contradiction goes here.]
    Therefore x ≠ 3.
Thus, if x² + y = 13 and y ≠ 4 then x ≠ 3.


The indenting in this outline of the proof will not be part of the final 
proof. We have done it here to make the underlying structure of the proof 
clear. The first and last lines go together and indicate that we are proving a 
conditional statement by assuming the antecedent and proving the 
consequent. Between these lines is a proof of the consequent, x ≠ 3, which
we have set off from the first and last lines by indenting it. This inner proof
has the form of a proof by contradiction, as indicated by its first and last
lines. Between these lines we still need to fill in a proof of a contradiction.

At this point we don’t have a particular statement as our goal; any 
impossible conclusion will do. We must therefore look more closely at the 
givens to see if some of them contradict others. In this case, the first and 
third together imply that y = 4, which contradicts the second. 


Solution 


Theorem. If x² + y = 13 and y ≠ 4 then x ≠ 3.


Proof. Suppose x² + y = 13 and y ≠ 4. Suppose x = 3. Substituting this into
the equation x² + y = 13, we get 9 + y = 13, so y = 4. But this contradicts the
fact that y ≠ 4. Therefore x ≠ 3. Thus, if x² + y = 13 and y ≠ 4 then x ≠ 3.
□
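
The contradiction in this proof says that the three givens x² + y = 13, y ≠ 4, and x = 3 cannot all hold at once. The Python sketch below illustrates this by searching a finite grid of integers; the grid and the variable names are assumptions made here, and a finite search is only a sanity check, not a replacement for the proof.

```python
from itertools import product

grid = range(-20, 21)  # finite sample of integers (an assumption)

# Look for values satisfying all three givens of the contradiction step.
witnesses = [
    (x, y)
    for x, y in product(grid, repeat=2)
    if x * x + y == 13 and y != 4 and x == 3
]

# Dropping the given y != 4 makes the remaining givens consistent again.
without_second_given = [
    (x, y)
    for x, y in product(grid, repeat=2)
    if x * x + y == 13 and x == 3
]

print(witnesses, without_second_given)
```

The first list is empty, while the second contains only (3, 4), which matches the step in the proof where x = 3 forces y = 4.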

You may be wondering at this point why we were justified in concluding, 
when we reached a contradiction in the proof, that x # 3. After all, the 
second list of givens in our scratch work contained three givens. How could 
we be sure, when we reached a contradiction, that the culprit was the third 
given, x = 3? To answer this question, look back at the first givens and goal 
analysis for this example. According to that analysis, there were two givens, 
x² + y = 13 and y ≠ 4, from which we had to deduce the goal x ≠ 3. Those
givens were introduced as assumptions in the first sentence of the proof.
Our proof that x ≠ 3 took place in a context in which those assumptions
were in force, as indicated by the indenting in the outline of the proof in our
scratch work. Thus, we only had to show that x ≠ 3 under the assumption
that x² + y = 13 and y ≠ 4. When we reached a contradiction, we didn’t need
to figure out which of the three statements in the second list of givens was 
false. We were certainly justified in concluding that if neither of the first 
two was the culprit, then it had to be the third, and that was all that was 
required to finish the proof. 

Proving a goal by contradiction has the advantage that it allows you to 
assume that your conclusion is false, providing you with another given to 
work with. But it has the disadvantage that it leaves you with a rather vague 
goal: produce a contradiction by proving something that you know is false. 
Because all the proof strategies we have discussed so far depend on 
analyzing the logical form of the goal, it appears that none of them will help
you to achieve the goal of producing a contradiction. In the preceding proof
we were forced to look more closely at our givens to find a contradiction. In
this case we did it by proving that y = 4, contradicting the given y ≠ 4. This
illustrates a pattern that occurs often in proofs by contradiction: if one of the
givens has the form ¬P, then you can produce a contradiction by proving P.
This is our first strategy based on the logical form of a given.


To use a given of the form ¬P:


If you’re doing a proof by contradiction, try making P your goal. If you
can prove P, then the proof will be complete, because P contradicts the
given ¬P.


Scratch work 


Before using strategy:

Givens              Goal
¬P                  Contradiction

After using strategy:

Givens              Goal
¬P                  P

Form of final proof:

    [Proof of P goes here.]
    Since we already know ¬P, this is a contradiction.


Although we have recommended proof by contradiction for proving 
goals of the form ¬P, it can be used for any goal. Usually it’s best to try the
other strategies first if any of them apply; but if you’re stuck, you can try 
proof by contradiction in any proof. 

The next example illustrates this and also another important rule of proof 
writing: in many cases the logical form of a statement can be discovered by 
writing out the definition of some mathematical word or symbol that occurs 
in the statement. For this reason, knowing the precise statements of the
definitions of all mathematical terms is extremely important when you’re
writing a proof. 


Example 3.2.3. Suppose A, B, and C are sets, A \ B ⊆ C, and x is anything
at all. Prove that if x ∈ A \ C then x ∈ B.


Scratch work 


We’re given that A \ B ⊆ C, and our goal is x ∈ A \ C → x ∈ B. Because the
goal is a conditional statement, our first step is to transform the problem by
adding x ∈ A \ C as a second given and making x ∈ B our goal:

Givens              Goal
A \ B ⊆ C           x ∈ B
x ∈ A \ C


The form of the final proof will therefore be as follows:

Suppose x ∈ A \ C.
    [Proof of x ∈ B goes here.]
Thus, if x ∈ A \ C then x ∈ B.


The goal x € B contains no logical connectives, so none of the techniques 
we have studied so far apply, and it is not obvious why the goal follows 
from the givens. Lacking anything else to do, we try proof by contradiction: 


Givens              Goal
A \ B ⊆ C           Contradiction
x ∈ A \ C
x ∉ B


As before, this transformation of the problem also enables us to fill in a few
more sentences of the proof:


Suppose x ∈ A \ C.
    Suppose x ∉ B.
        [Proof of contradiction goes here.]
    Therefore x ∈ B.
Thus, if x ∈ A \ C then x ∈ B.


Because we’re doing a proof by contradiction and our last given is now a
negated statement, we could try using our strategy for using givens of the
form ¬P. Unfortunately, this strategy suggests making x ∈ B our goal,
which just gets us back to where we started. We must look at the other
givens to try to find the contradiction.

In this case, writing out the definition of the second given is the key to
the proof, since this definition also contains a negated statement. By
definition, x ∈ A \ C means x ∈ A and x ∉ C. Replacing this given by its
definition gives us:

Givens              Goal
A \ B ⊆ C           Contradiction
x ∈ A
x ∉ C
x ∉ B


Now the third given also has the form ¬P, where P is the statement x ∈
C, so we can apply the strategy for using givens of the form ¬P and make x
∈ C our goal. Showing that x ∈ C would complete the proof because it
would contradict the given x ∉ C.


Givens              Goal
A \ B ⊆ C           x ∈ C
x ∈ A
x ∉ C
x ∉ B


Once again, we can add a little more to the proof we are gradually
writing by filling in the fact that we plan to derive our contradiction by
proving x ∈ C. We also add the definition of x ∈ A \ C to the proof, inserting
it in what seems like the most logical place, right after we stated that x ∈ A \
C:


Suppose x ∈ A \ C. This means that x ∈ A and x ∉ C.
    Suppose x ∉ B.
        [Proof of x ∈ C goes here.]
        This contradicts the fact that x ∉ C.
    Therefore x ∈ B.
Thus, if x ∈ A \ C then x ∈ B.


We have finally reached a point where the goal follows easily from the
givens. From x ∈ A and x ∉ B we conclude that x ∈ A \ B. Since A \ B ⊆ C
it follows that x ∈ C.


Solution 


Theorem. Suppose A, B, and C are sets, A \ B ⊆ C, and x is anything at all.
If x ∈ A \ C then x ∈ B.


Proof. Suppose x ∈ A \ C. This means that x ∈ A and x ∉ C. Suppose x ∉ B.
Then x ∈ A \ B, so since A \ B ⊆ C, x ∈ C. But this contradicts the fact that x
∉ C. Therefore x ∈ B. Thus, if x ∈ A \ C then x ∈ B.
□

The strategy we’ve recommended for using givens of the form ¬P only
applies if you are doing a proof by contradiction. For other kinds of proofs,
the next strategy can be used. This strategy is based on the fact that givens
of the form ¬P, like goals of this form, may be easier to work with if they
are reexpressed as positive statements.


To use a given of the form ¬P:
If possible, reexpress this given in some other form.


We have discussed strategies for working with both givens and goals of
the form ¬P, but only strategies for goals of the form P → Q. We now fill
this gap by giving two strategies for using givens of the form P → Q. We
said before that many strategies for using givens suggest ways of drawing
inferences from the givens. Such strategies are called rules of inference.
Both of our strategies for using givens of the form P → Q are examples of
rules of inference.


To use a given of the form P → Q:

If you are also given P, or if you can prove that P is true, then you can
use this given to conclude that Q is true. Since it is equivalent to ¬Q → ¬P,
if you can prove that Q is false, you can use this given to conclude that P is
false.


The first of these rules of inference says that if you know that both P and
P → Q are true, you can conclude that Q must also be true. Logicians call
this rule modus ponens. We saw this rule used in one of our first examples
of valid deductive reasoning in Chapter 1, argument 2 in Example 1.1.1.
The validity of this form of reasoning was verified using the truth table for
the conditional connective in Section 1.5.

The second rule, called modus tollens, says that if you know that P → Q
is true and Q is false, you can conclude that P must also be false. The
validity of this rule can also be checked with truth tables, as you are asked
to show in exercise 14. Usually you won’t find a given of the form P → Q
to be much use until you are able to prove either P or ¬Q. However, if you
ever reach a point in your proof where you have determined that P is true,
you should probably use this given immediately to conclude that Q is true.
Similarly, if you ever establish ¬Q, immediately use this given to conclude
¬P.
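
The truth-table check of a rule of inference can also be carried out mechanically. The following is only an illustrative sketch in Python, not part of the proof methods of this chapter; the helper names implies and is_valid are hypothetical names chosen for this example. It treats an argument form as valid when the conclusion is true in every row of the truth table in which all of the premises are true, and it includes one invalid form for contrast.

```python
from itertools import product

def implies(p, q):
    """Truth table of the conditional: false only when p is true and q is false."""
    return (not p) or q

def is_valid(premises, conclusion, num_vars):
    """Return True if the conclusion holds in every truth-table row
    in which all of the premises hold."""
    for values in product([True, False], repeat=num_vars):
        if all(prem(*values) for prem in premises) and not conclusion(*values):
            return False
    return True

# Modus ponens: from P -> Q and P, infer Q.
modus_ponens = is_valid([lambda p, q: implies(p, q), lambda p, q: p],
                        lambda p, q: q, 2)

# Modus tollens: from P -> Q and not Q, infer not P.
modus_tollens = is_valid([lambda p, q: implies(p, q), lambda p, q: not q],
                         lambda p, q: not p, 2)

# For contrast: from P -> Q and Q, inferring P is NOT a valid rule.
affirming_consequent = is_valid([lambda p, q: implies(p, q), lambda p, q: q],
                                lambda p, q: p, 2)

print(modus_ponens, modus_tollens, affirming_consequent)  # True True False
```

The last check fails in the row where P is false and Q is true, which is why a given of the form P → Q is of no use when all you know is Q.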

Although most of our examples will involve specific mathematical 
statements, occasionally we will do examples of proofs containing letters 
standing for unspecified statements. Later in this chapter we will be able to 
use this method to verify some of the equivalences from Chapter 2 that 
could only be justified on intuitive grounds before. Here’s an example of 
this kind, illustrating the use of modus ponens and modus tollens. 


Example 3.2.4. Suppose P → (Q → R). Prove that ¬R → (P → ¬Q).

Scratch work


This could actually be done with a truth table, as you are asked to show in 
exercise 15, but let’s do it using the proof strategies we’ve been discussing. 
We start with the following situation: 
Givens              Goal
P → (Q → R)         ¬R → (P → ¬Q)


Our only given is a conditional statement. By the rules of inference just
discussed, if we knew P we could use modus ponens to conclude Q → R,
and if we knew ¬(Q → R) we could use modus tollens to conclude ¬P.
Because we don’t, at this point, know either of these, we can’t yet do
anything with this given. If either P or ¬(Q → R) ever gets added to the
givens list, then we should consider using modus ponens or modus tollens.
For now, we need to concentrate on the goal.

The goal is also a conditional statement, so we assume the antecedent and 
set the consequent as our new goal: 


Givens              Goal
P → (Q → R)         P → ¬Q
¬R


We can also now write a little bit of the proof:


Suppose ¬R.
    [Proof of P → ¬Q goes here.]
Therefore ¬R → (P → ¬Q).


We still can’t do anything with the givens, but the goal is another 
conditional, so we use the same strategy again: 
Givens              Goal
P → (Q → R)         ¬Q
¬R
P


Now the proof looks like this:


Suppose ¬R.
    Suppose P.
        [Proof of ¬Q goes here.]
    Therefore P → ¬Q.
Therefore ¬R → (P → ¬Q).


We’ve been watching for our chance to use our first given by applying
either modus ponens or modus tollens, and now we can do it. Since we
know P → (Q → R) and P, by modus ponens we can infer Q → R. Any
conclusion inferred from the givens can be added to the givens column:

Givens              Goal
P → (Q → R)         ¬Q
¬R
P
Q → R


We also add one more line to the proof:


Suppose ¬R.
    Suppose P.
        Since P and P → (Q → R), it follows that Q → R.
        [Proof of ¬Q goes here.]
    Therefore P → ¬Q.
Therefore ¬R → (P → ¬Q).


Finally, our last step is to use modus tollens. We now know Q → R and
¬R, so by modus tollens we can conclude ¬Q. This is our goal, so the proof
is done.


Solution 


Theorem. Suppose P → (Q → R). Then ¬R → (P → ¬Q).


Proof. Suppose ¬R. Suppose P. Since P and P → (Q → R), it follows that Q
→ R. But then, since ¬R, we can conclude ¬Q. Thus, P → ¬Q. Therefore
¬R → (P → ¬Q).
□

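
As the scratch work noted, this theorem could also be checked with a truth table. The short Python sketch below does that check by brute force; it is an illustration only, and the names implies, rows, and bad_rows are hypothetical. It searches all eight truth assignments for a row in which the given P → (Q → R) is true but the goal ¬R → (P → ¬Q) is false.

```python
from itertools import product

def implies(p, q):
    """Truth table of the conditional: false only when p is true and q is false."""
    return (not p) or q

# All 8 assignments of truth values to (P, Q, R).
rows = list(product([True, False], repeat=3))

# Rows in which the given is true but the goal is false.
bad_rows = [(p, q, r) for p, q, r in rows
            if implies(p, implies(q, r))
            and not implies(not r, implies(p, not q))]

print(bad_rows)  # [] : no row makes the given true and the goal false
```

The empty list agrees with the proof: the only row that falsifies the goal has P and Q true and R false, and in that row the given is false as well.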
Sometimes if you’re stuck you can use rules of inference to work
backward. For example, suppose one of your givens has the form P → Q
and your goal is Q. If only you could prove P, you could use modus ponens
to reach your goal. This suggests treating P as your goal instead of Q. If you
can prove P, then you’ll just have to add one more step to the proof to reach
your original goal Q.


Example 3.2.5. Suppose that A ⊆ B, a ∈ A, and a ∉ B \ C. Prove that a ∈
C.


Scratch work 


Givens              Goal
A ⊆ B               a ∈ C
a ∈ A
a ∉ B \ C


Our third given is a negative statement, so we begin by reexpressing it as
an equivalent positive statement. According to the definition of the
difference of two sets, this given means ¬(a ∈ B ∧ a ∉ C), and by one of
De Morgan’s laws, this is equivalent to a ∉ B ∨ a ∈ C. Because our goal is
a ∈ C, it is probably more useful to rewrite this in the equivalent form a ∈ B
→ a ∈ C:


Givens              Goal
A ⊆ B               a ∈ C
a ∈ A
a ∈ B → a ∈ C
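
The chain of equivalences used to reexpress the third given can be confirmed row by row. The Python sketch below is only an illustration; the variable names in_B and in_C (standing for the statements a ∈ B and a ∈ C) and the table CONDITIONAL are hypothetical names introduced here. The truth table of the conditional is written out explicitly so that the comparison is not circular.

```python
from itertools import product

# Truth table of the conditional P -> Q, listed explicitly row by row.
CONDITIONAL = {(True, True): True, (True, False): False,
               (False, True): True, (False, False): True}

for in_B, in_C in product([True, False], repeat=2):
    not_in_difference = not (in_B and not in_C)  # a ∉ B \ C, by definition
    de_morgan = (not in_B) or in_C               # a ∉ B ∨ a ∈ C
    conditional = CONDITIONAL[(in_B, in_C)]      # a ∈ B → a ∈ C
    # All three forms agree in every row of the truth table.
    assert not_in_difference == de_morgan == conditional
```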


Now we can use our strategy for using givens of the form P → Q. Our
goal is a ∈ C, and we are given that a ∈ B → a ∈ C. If we could prove that
a ∈ B, then we could use modus ponens to reach our goal. So let’s try
treating a ∈ B as our goal and see if that makes the problem easier:

Givens              Goal
A ⊆ B               a ∈ B
a ∈ A
a ∈ B → a ∈ C


Now it is clear how to reach the goal. Since a ∈ A and A ⊆ B, a ∈ B.


Solution


Theorem. Suppose that A ⊆ B, a ∈ A, and a ∉ B \ C. Then a ∈ C.


Proof. Since a ∈ A and A ⊆ B, we can conclude that a ∈ B. But a ∉ B \ C,
so it follows that a ∈ C.
□


Exercises 


*1. This problem could be solved by using truth tables, but don’t do
it that way. Instead, use the methods for writing proofs discussed
so far in this chapter. (See Example 3.2.4.)
(a) Suppose P → Q and Q → R are both true. Prove that P → R is
true.
(b) Suppose ¬R → (P → ¬Q) is true. Prove that P → (Q → R) is
true.

2. This problem could be solved by using truth tables, but don’t do
it that way. Instead, use the methods for writing proofs discussed
so far in this chapter. (See Example 3.2.4.)
(a) Suppose P → Q and R → ¬Q are both true. Prove that P → ¬R is
true.
(b) Suppose that P is true. Prove that Q → ¬(Q → ¬P) is true.

3. Suppose A ⊆ C, and B and C are disjoint. Prove that if x ∈ A then
x ∉ B.

4. Suppose that A \ B is disjoint from C and x ∈ A. Prove that if x ∈
C then x ∈ B.

5. Prove that it cannot be the case that x ∈ A \ B and x ∈ B \ C.

6. Use the method of proof by contradiction to prove the theorem in
Example 3.2.1.

7. Use the method of proof by contradiction to prove the theorem in
Example 3.2.5.

8. Suppose that y + x = 2y - x, and x and y are not both zero. Prove
that y ≠ 0.

9. Suppose that a and b are nonzero real numbers. Prove that if a <
1/a < b < 1/b then a < -1.

10. Suppose that x and y are real numbers. Prove that if x²y = 2x + y,
then if y ≠ 0 then x ≠ 0.

11. Suppose that x and y are real numbers. Prove that if x ≠ 0, then if
y = (3x² + 2y)/(x² + 2) then y = 3.

*12. Consider the following incorrect theorem:

Incorrect Theorem. Suppose x and y are real numbers and x + y
= 10. Then x ≠ 3 and y ≠ 8.

(a) What’s wrong with the following proof of the theorem?

Proof. Suppose the conclusion of the theorem is false. Then x = 3
and y = 8. But then x + y = 11, which contradicts the given
information that x + y = 10. Therefore the conclusion must be
true.
□

(b) Show that the theorem is incorrect by finding a counterexample.

13. Consider the following incorrect theorem:

Incorrect Theorem. Suppose that A ⊆ C, B ⊆ C, and x ∈ A.
Then x ∈ B.

(a) What’s wrong with the following proof of the theorem?

Proof. Suppose that x ∉ B. Since x ∈ A and A ⊆ C, x ∈ C. Since
x ∉ B and B ⊆ C, x ∉ C. But now we have proven both x ∈ C
and x ∉ C, so we have reached a contradiction. Therefore x ∈ B.
□

(b) Show that the theorem is incorrect by finding a counterexample.

14. Use truth tables to show that modus tollens is a valid rule
of inference.

*15. Use truth tables to check the correctness of the theorem in
Example 3.2.4.

16. Use truth tables to check the correctness of the statements
in exercise 1.

17. Use truth tables to check the correctness of the statements
in exercise 2.

18. Can the proof in Example 3.2.2 be modified to prove that if
x² + y = 13 and x ≠ 3 then y ≠ 4? Explain.


3.3 Proofs Involving Quantifiers 


Look again at Example 3.2.3. In that example we said that x could be
anything at all, and we proved the statement x ∈ A \ C → x ∈ B. Because the
reasoning we used would apply no matter what x was, our proof actually
shows that x ∈ A \ C → x ∈ B is true for all values of x. In other words, we
can conclude ∀x(x ∈ A \ C → x ∈ B).

This illustrates the easiest and most straightforward way of proving a
goal of the form ∀xP(x). If you can give a proof of the goal P(x) that would
work no matter what x was, then you can conclude that ∀xP(x) must be
true. To make sure that your proof would work for any value of x, it is
important to start your proof with no assumptions about x. Mathematicians
express this by saying that x must be arbitrary. In particular, you must not
assume that x is equal to any other object already under discussion in the
proof. Thus, if the letter x is already being used in the proof to stand for
some particular object, then you cannot use it to stand for an arbitrary
object. In this case you must choose a different variable that is not already
being used in the proof, say y, and replace the goal ∀xP(x) with the
equivalent statement ∀yP(y). Now you can proceed by letting y stand for
an arbitrary object and proving P(y).


To prove a goal of the form ∀xP(x):

Let x stand for an arbitrary object and prove P(x). The letter x must be a 
new variable in the proof. If x is already being used in the proof to stand for 
something, then you must choose an unused variable, say y, to stand for the 
arbitrary object, and prove P(y). 


Scratch work 


Before using strategy:

Givens              Goal
---                 ∀xP(x)

After using strategy:

Givens              Goal
---                 P(x)

Form of final proof:

Let x be arbitrary.
    [Proof of P(x) goes here.]
Since x was arbitrary, we can conclude that ∀xP(x).


Example 3.3.1. Suppose A, B, and C are sets, and A \ B ⊆ C. Prove that A
\ C ⊆ B.


Scratch work

Givens              Goal
A \ B ⊆ C           A \ C ⊆ B


As usual, we look first at the logical form of the goal to plan our strategy.
In this case we must write out the definition of ⊆ to determine the logical
form of the goal.


Givens              Goal
A \ B ⊆ C           ∀x(x ∈ A \ C → x ∈ B)


Because the goal has the form ∀xP(x), where P(x) is the statement x ∈ A \
C → x ∈ B, we will introduce a new variable x into the proof to stand for an
arbitrary object and then try to prove x ∈ A \ C → x ∈ B. Note that x is a
new variable in the proof. It appeared in the logical form of the goal as a
bound variable, but remember that bound variables don’t stand for anything
in particular. We have not yet used x as a free variable in any statement, so
it has not been used to stand for any particular object. To make sure x is
arbitrary we must be careful not to add any assumptions about x to the
givens column. However, we do change our goal:

Givens              Goal
A \ B ⊆ C           x ∈ A \ C → x ∈ B


According to our strategy, the final proof should look like this:


Let x be arbitrary.
    [Proof of x ∈ A \ C → x ∈ B goes here.]
Since x was arbitrary, we can conclude that ∀x(x ∈ A \ C → x ∈ B), so A \
C ⊆ B.


The problem is now exactly the same as in Example 3.2.3, so the rest of 
the solution is the same as well. In other words, we can simply insert the 
proof we wrote in Example 3.2.3 between the first and last sentences of the 
proof written here. 


Solution 


Theorem. Suppose A, B, and C are sets, and A \ B ⊆ C. Then A \ C ⊆ B.


Proof. Let x be arbitrary. Suppose x ∈ A \ C. This means that x ∈ A and x ∉
C. Suppose x ∉ B. Then x ∈ A \ B, so since A \ B ⊆ C, x ∈ C. But this
contradicts the fact that x ∉ C. Therefore x ∈ B. Thus, if x ∈ A \ C then x ∈
B. Since x was arbitrary, we can conclude that ∀x(x ∈ A \ C → x ∈ B), so A \
C ⊆ B.
□

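
A computer check over a small universe is no substitute for the proof, since the theorem concerns all sets, but it can serve as a sanity check on the statement. The following Python sketch makes that check under stated assumptions: the universe {0, 1, 2} and the names subsets, all_sets, and counterexamples are hypothetical choices made for this illustration. It tries every triple of subsets A, B, C and looks for one with A \ B ⊆ C but not A \ C ⊆ B.

```python
from itertools import chain, combinations

def subsets(universe):
    """Return all subsets of the given finite universe as frozensets."""
    items = list(universe)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

universe = {0, 1, 2}
all_sets = subsets(universe)  # the 8 subsets of {0, 1, 2}

# Triples (A, B, C) satisfying the hypothesis A \ B ⊆ C but violating
# the conclusion A \ C ⊆ B.  In Python, A - B is set difference and
# X <= Y tests whether X is a subset of Y.
counterexamples = [(A, B, C)
                   for A in all_sets for B in all_sets for C in all_sets
                   if (A - B) <= C and not (A - C) <= B]

print(len(all_sets) ** 3, len(counterexamples))  # 512 0
```

Note the contrast with the proof: the program examines 512 cases one at a time, while the proof reasons about a single arbitrary x and thereby covers every set at once.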
Notice that, although this proof shows that every element of A \ C is also
an element of B, it does not contain phrases such as “every element of A \
C” or “all elements of A \ C.” For most of the proof we simply reason about
x, which is treated as a single, fixed element of A \ C. We pretend that x
stands for some particular element of A \ C, being careful to make no
assumptions about which element it stands for. It is only at the end of the
proof that we observe that, because x was arbitrary, our conclusions about x
would be true no matter what x was. This is the main advantage of using
this strategy to prove a goal of the form ∀xP(x). It enables you to prove a
goal about all objects by reasoning about only one object, as long as that
object is arbitrary. If you are proving a goal of the form ∀xP(x) and you
find yourself saying a lot about “all x’s” or “every x,” you are probably
making your proof unnecessarily complicated by not using this strategy.


As we saw in Chapter 2, statements of the form ∀x(P(x) → Q(x)) are
quite common in mathematics. It might be worthwhile, therefore, to
consider how the strategies we’ve discussed can be combined to prove a
goal of this form. Because the goal starts with ∀x, the first step is to let x be
arbitrary and try to prove P(x) → Q(x). To prove this goal, you will
probably want to assume that P(x) is true and prove Q(x). Thus, the proof 
will probably start like this: “Let x be arbitrary. Suppose P(x).” It will then 
proceed with the steps needed to reach the goal Q(x). Often in this type of 
proof the statement that x is arbitrary is left out, and the proof simply starts 
with “Suppose P(x).” When a new variable x is introduced into a proof in 
this way, it is usually understood that x is arbitrary. In other words, no 
assumptions are being made about x other than the stated one that P(x) is 
true. 

An important example of this type of proof is a proof in which the goal 
has the form ∀x ∈ A P(x). Recall that ∀x ∈ A P(x) means the same thing as
∀x(x ∈ A → P(x)), so according to our strategy the proof should start with
“Suppose x ∈ A” and then proceed with the steps needed to conclude that
P(x) is true. Once again, it is understood that no assumptions are being
made about x other than the stated assumption that x ∈ A, so x stands for an
arbitrary element of A. 

Mathematicians sometimes skip other steps in proofs, if knowledgeable 
readers could be expected to fill them in themselves. In particular, many of 
our proof strategies have suggested that the proof end with a sentence that 
sums up why the reasoning that has been given in the proof leads to the 
desired conclusion. In a proof in which several of these strategies have been
combined, there might be several of these summing up sentences, one after
another, at the end of the proof. Mathematicians often condense this 
summing up into one sentence, or even skip it entirely. When you are 
reading a proof written by someone else, you may find it helpful to fill in 
these skipped steps. 


Example 3.3.2. Suppose A and B are sets. Prove that if A ∩ B = A then A
⊆ B.


Scratch work 


Our goal is A ∩ B = A → A ⊆ B. Because the goal is a conditional
statement, we add the antecedent to the givens list and make the consequent
the goal. We will also write out the definition of ⊆ in the new goal to show
what its logical form is.

Givens              Goal
A ∩ B = A           ∀x(x ∈ A → x ∈ B)

Now the goal has the form ∀x(P(x) → Q(x)), where P(x) is the statement
x ∈ A and Q(x) is the statement x ∈ B. We therefore let x be arbitrary,
assume x ∈ A, and prove x ∈ B:

Givens              Goal
A ∩ B = A           x ∈ B
x ∈ A


Combining the proof strategies we have used, we see that the final proof 
will have this form: 


Suppose A ∩ B = A.
    Let x be arbitrary.
        Suppose x ∈ A.
            [Proof of x ∈ B goes here.]
        Therefore x ∈ A → x ∈ B.
    Since x was arbitrary, we can conclude that ∀x(x ∈ A → x ∈ B), so A ⊆ B.
Therefore, if A ∩ B = A then A ⊆ B.


As discussed earlier, when we write up the final proof we can skip the 
sentence “Let x be arbitrary,” and we can also skip some or all of the last 
three sentences. 

We have now reached the point at which we can analyze the logical form 
of the goal no further. Fortunately, when we look at the givens, we discover 
that the goal follows easily. Since x ∈ A and A ∩ B = A, it follows that x ∈ A
∩ B, so x ∈ B. (In this last step we are using the definition of ∩: x ∈ A ∩ B
means x ∈ A and x ∈ B.)


Solution 


Theorem. Suppose A and B are sets. If A ∩ B = A then A ⊆ B.


Proof. Suppose A ∩ B = A, and suppose x ∈ A. Then since A ∩ B = A, x ∈ A
∩ B, so x ∈ B. Since x was an arbitrary element of A, we can conclude that
A ⊆ B.
□

Proving a goal of the form ∃xP(x) also involves introducing a new
variable x into the proof and proving P(x), but in this case x will not be 
arbitrary. Because you only need to prove that P(x) is true for at least one x, 
it suffices to assign a particular value to x and prove P(x) for this one value 
of x. 


To prove a goal of the form ∃xP(x):

Try to find a value of x for which you think P(x) will be true. Then start 
your proof with “Let x = (the value you decided on)” and proceed to prove 
P(x) for this value of x. Once again, x should be a new variable. If the letter 
x is already being used in the proof for some other purpose, then you should 
choose an unused variable, say y, and rewrite the goal in the equivalent 
form ∃yP(y). Now proceed as before by starting your proof with “Let y =
(the value you decided on)” and prove P(y). 


Scratch work 


Before using strategy:

Givens              Goal
---                 ∃xP(x)

After using strategy:

Givens                              Goal
---                                 P(x)
x = (the value you decided on)

Form of final proof:

Let x = (the value you decided on).
    [Proof of P(x) goes here.]
Thus, ∃xP(x).


Finding the right value to use for x may be difficult in some cases. One 
method that is sometimes helpful is to assume that P(x) is true and then see 
if you can figure out what x must be, based on this assumption. If P(x) is an 
equation involving x, this amounts to solving the equation for x. However, 
if this doesn’t work, you may use any other method you please to try to find 
a value to use for x, including trial-and-error and guessing. The reason you 
have such freedom with this step is that the reasoning you use to find a 
value for x will not appear in the final proof. This is because of our rule that 
a proof should contain only the reasoning needed to justify the conclusion 
of the proof, not an explanation of how you thought of that reasoning. To 
justify the conclusion that ∃xP(x) is true it is only necessary to verify that
P(x) comes out true when x is assigned some particular value. How you 
thought of that value is your own business, and not part of the justification 
of the conclusion. 


Example 3.3.3. Prove that for every real number x, if x > 0 then there is a 
real number y such that y(y + 1) = x.


Scratch work 


In symbols, our goal is ∀x(x > 0 → ∃y[y(y + 1) = x]), where the variables x
and y in this statement are understood to range over ℝ. We therefore start by
letting x be an arbitrary real number, and we then assume that x > 0 and try
to prove that ∃y[y(y + 1) = x]. Thus, we now have the following given and
goal:


Givens              Goal
x > 0               ∃y[y(y + 1) = x]


Because our goal has the form ∃yP(y), where P(y) is the statement y(y +
1) = x, according to our strategy we should try to find a value of y for which
P(y) is true. In this case we can do it by solving the equation y(y + 1) = x for
y. It’s a quadratic equation and can be solved using the quadratic formula:

    y(y + 1) = x  ⇒  y² + y - x = 0  ⇒  y = (-1 ± √(1 + 4x))/2.

Note that √(1 + 4x) is defined, since we have x > 0 as a given. We have
actually found two solutions for y, but to prove that ∃y[y(y + 1) = x] we
only need to exhibit one value of y that makes the equation y(y + 1) = x true.
Either of the two solutions could be used in the proof. We will use the
solution y = (-1 + √(1 + 4x))/2.

The steps we’ve used to solve for y should not appear in the final proof.
In the final proof we will simply say “Let y = (-1 + √(1 + 4x))/2” and then
prove that y(y + 1) = x. In other words, the final proof will have this form:


Let x be an arbitrary real number.
    Suppose x > 0.
        Let y = (-1 + √(1 + 4x))/2.
            [Proof of y(y + 1) = x goes here.]
        Thus, ∃y[y(y + 1) = x].
    Therefore x > 0 → ∃y[y(y + 1) = x].
Since x was arbitrary, we can conclude that ∀x(x > 0 → ∃y[y(y + 1) = x]).


To see what must be done to fill in the remaining gap in the proof, we
add y = (-1 + √(1 + 4x))/2 to the givens list and make y(y + 1) = x the goal:

Givens                          Goal
x > 0                           y(y + 1) = x
y = (-1 + √(1 + 4x))/2


We can now prove that the equation y(y + 1) = x is true by simply
substituting (-1 + √(1 + 4x))/2 for y and verifying that the resulting equation
is true.
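
Before writing out the algebra, it can be reassuring to test the chosen value of y numerically. The Python sketch below is only a spot check under stated assumptions, not a proof: the function name witness and the sample values of x are hypothetical choices for this illustration. It checks a few positive real values in floating point, and then checks exactly the integer cases x = n(n + 1), for which 1 + 4x = (2n + 1)² and the chosen value of y is n.

```python
import math

def witness(x):
    """The value of y chosen in the scratch work, (-1 + sqrt(1 + 4x))/2.
    The square root is defined because x > 0."""
    if x <= 0:
        raise ValueError("the theorem only concerns x > 0")
    return (-1 + math.sqrt(1 + 4 * x)) / 2

# Floating-point spot checks: y(y + 1) should agree with x up to rounding.
for x in [0.25, 1.0, 2.0, 6.0, 1000.0]:
    y = witness(x)
    assert math.isclose(y * (y + 1), x, rel_tol=1e-9)

# Exact integer checks: when x = n(n + 1), 1 + 4x = (2n + 1)^2, so y = n.
for n in range(1, 100):
    x = n * (n + 1)
    root = math.isqrt(1 + 4 * x)
    assert root * root == 1 + 4 * x
    y = (root - 1) // 2
    assert y == n and y * (y + 1) == x
```

As with the steps used to solve for y, none of this experimentation belongs in the final proof, which only needs to state the value of y and verify the equation.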


Solution 


Theorem. For every real number x, if x > 0 then there is a real number y
such that y(y + 1) = x.


Proof. Let x be an arbitrary real number, and suppose x > 0. Let

    y = (-1 + √(1 + 4x))/2,

which is defined since x > 0. Then

    y(y + 1) = ((-1 + √(1 + 4x))/2) · ((-1 + √(1 + 4x))/2 + 1)
             = ((√(1 + 4x) - 1)/2) · ((√(1 + 4x) + 1)/2)
             = ((1 + 4x) - 1)/4
             = 4x/4 = x.
□

Sometimes when you’re proving a goal of the form ∃yQ(y) you won’t be
able to tell just by looking at the statement Q(y) what value you should plug 
in for y. In this case you may want to look more closely at the givens to see 
if they suggest a value to use for y. In particular, a given of the form ∃xP(x)
may be helpful in this situation. This given says that an object with a certain
property exists. It is probably a good idea to imagine that a particular object
with this property has been chosen and to introduce a new variable, say x₀,
into the proof to stand for this object. Thus, for the rest of the proof you will
be using x₀ to stand for some particular object, and you can assume that
with x₀ standing for this object, P(x₀) is true. In other words, you can add
P(x₀) to your givens list. This object x₀, or something related to it, might
turn out to be the right thing to plug in for y to make Q(y) come out true.


To use a given of the form ∃xP(x):

Introduce a new variable x₀ into the proof to stand for an object for which
P(x₀) is true. This means that you can now assume that P(x₀) is true.
Logicians call this rule of inference existential instantiation.

Note that using a given of the form ∃xP(x) is very different from proving
a goal of the form ∃xP(x), because when using a given of the form ∃xP(x),
you don’t get to choose a particular value to plug in for x. You can assume
that x₀ stands for some object for which P(x₀) is true, but you can’t assume
anything else about x₀. On the other hand, a given of the form ∀xP(x) says
that P(x) would be true no matter what value is assigned to x. You can
therefore choose any value you wish to plug in for x and use this given to
conclude that P(x) is true.


To use a given of the form ∀xP(x):
You can plug in any value, say a, for x and use this given to conclude that 
P(a) is true. This rule is called universal instantiation. 


Usually, if you have a given of the form ∃xP(x), you should apply
existential instantiation to it immediately. A good guideline is: if you know
something exists, you should give it a name. On the other hand, you won’t
be able to apply universal instantiation to a given of the form ∀xP(x) unless
you have a particular value a to plug in for x, so you might want to wait
until a likely choice for a pops up in the proof. For example, consider a
given of the form ∀x(P(x) → Q(x)). You can use this given to conclude that
P(a) → Q(a) for any a, but according to our rule for using givens that are
conditional statements, this conclusion probably won’t be very useful unless
you know either P(a) or ¬Q(a). You should probably wait until an object a
appears in the proof for which you know either P(a) or ¬Q(a), and plug this
a in for x when it appears.

We’ve already used this technique in some of our earlier proofs when 
dealing with givens of the form A ⊆ B. For instance, in Example 3.2.5 we
used the givens A ⊆ B and a ∈ A to conclude that a ∈ B. The justification
for this reasoning is that A ⊆ B means ∀x(x ∈ A → x ∈ B), so by universal
instantiation we can plug in a for x and conclude that a ∈ A → a ∈ B. Since
we also know a ∈ A, it follows by modus ponens that a ∈ B.


Example 3.3.4. Suppose ℱ and 𝒢 are families of sets and ℱ ∩ 𝒢 ≠ ∅.
Prove that ⋂ℱ ⊆ ⋃𝒢.


Scratch work 


Our first step in analyzing the logical form of the goal is to write out the
meaning of the subset symbol, which gives us the statement ∀x(x ∈ ⋂ℱ →
x ∈ ⋃𝒢). We could go further with this analysis by writing out the
definitions of union and intersection, but the part of the analysis that we
have already done will be enough to allow us to decide how to get started 
on the proof. The definitions of union and intersection will be needed later 
in the proof, but we will wait until they are needed before filling them in. 
When analyzing the logical forms of givens and goals in order to figure out 
a proof, it is usually best to do only as much of the analysis as is needed to 
determine the next step of the proof. Going further with the logical analysis 
usually just introduces unnecessary complication, without providing any 
benefit. 


Because the goal means ∀x(x ∈ ⋂ℱ → x ∈ ⋃𝒢), we let x be arbitrary,
assume x ∈ ⋂ℱ, and try to prove x ∈ ⋃𝒢.

Givens            Goal
ℱ ∩ 𝒢 ≠ ∅        x ∈ ⋃𝒢
x ∈ ⋂ℱ


The new goal means ∃A ∈ 𝒢(x ∈ A), so to prove it we should try to find a
value that will “work” for A. Just looking at the goal doesn’t make it clear
how to choose A, so we look more closely at the givens. We begin by
writing them out in logical symbols:

Givens               Goal
∃A(A ∈ ℱ ∩ 𝒢)      ∃A ∈ 𝒢(x ∈ A)
∀A ∈ ℱ(x ∈ A)


The second given starts with ∀A, so we may not be able to use this given
until a likely value to plug in for A pops up during the course of the proof.
In particular, we should keep in mind that if we ever come across an
element of ℱ while trying to figure out the proof, we can plug it in for A in
the second given and conclude that it contains x as an element. The first
given, however, starts with ∃A, so we should use it immediately. It says that
there is some object that is an element of ℱ ∩ 𝒢. By existential
instantiation, we can introduce a name, say A₀, for this object. Thus, we can
treat A₀ ∈ ℱ ∩ 𝒢 as a given from now on. Because we now have a name, A₀,
for a particular element of ℱ ∩ 𝒢, it would be redundant to continue to
discuss the given statement ∃A(A ∈ ℱ ∩ 𝒢), so we will drop it from our list
of givens. Since our new given A₀ ∈ ℱ ∩ 𝒢 means A₀ ∈ ℱ and A₀ ∈ 𝒢, we
now have the following situation:

Givens               Goal
A₀ ∈ ℱ              ∃A ∈ 𝒢(x ∈ A)
A₀ ∈ 𝒢
∀A ∈ ℱ(x ∈ A)


If you’ve been paying close attention, you should know what the next
step should be. We decided before to keep our eyes open for any elements
of ℱ that might come up during the proof, because we might want to plug
them in for A in the last given. An element of ℱ has come up: A₀! Plugging
A₀ in for A in the last given, we can conclude that x ∈ A₀. Any conclusions
can be treated in the future as givens, so you can add this statement to the
givens column if you like.


Remember that we decided to look at the givens because we didn’t know
what value to assign to A in the goal. What we need is a value for A that is
in 𝒢 and that will make the statement x ∈ A come out true. Has this
consideration of the givens suggested a value to use for A? Yes! Use A = A₀.

Although we translated the statements x ∈ ⋂ℱ, x ∈ ⋃𝒢, and ℱ ∩ 𝒢 ≠ ∅
into logical symbols in order to figure out how to use them in the
proof, these translations are not usually written out when the proof is
written up in final form. In the final proof we just write these statements in
their original form and leave it to the reader of the proof to work out their
logical forms in order to follow our reasoning.


Solution 


Theorem. Suppose ℱ and 𝒢 are families of sets, and ℱ ∩ 𝒢 ≠ ∅. Then ⋂ℱ
⊆ ⋃𝒢.


Proof. Suppose x ∈ ⋂ℱ. Since ℱ ∩ 𝒢 ≠ ∅, we can let A₀ be an element of
ℱ ∩ 𝒢. Thus, A₀ ∈ ℱ and A₀ ∈ 𝒢. Since x ∈ ⋂ℱ and A₀ ∈ ℱ, it follows that
x ∈ A₀. But we also know that A₀ ∈ 𝒢, so we can conclude that x ∈ ⋃𝒢.

□
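If you would like to experiment with the theorem in Example 3.3.4, a brute-force check over a small universe can be instructive, although such a check is only evidence and not a proof. The following Python sketch is our own illustration. The helper names (powerset, big_union, big_intersection, check_example_3_3_4) are hypothetical, and we assume that every set involved is a subset of a small finite universe.

```python
from itertools import combinations


def powerset(universe):
    """Return all subsets of `universe` as a list of frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def big_union(family):
    """Union of all members of the family (empty if the family is empty)."""
    result = set()
    for member in family:
        result |= member
    return frozenset(result)


def big_intersection(family, universe):
    """Elements of `universe` that belong to every member of the family.

    Assumption: for an empty family this returns the whole universe.
    """
    return frozenset(x for x in universe
                     if all(x in member for member in family))


def check_example_3_3_4(universe):
    """Check that the intersection of F is a subset of the union of G
    for all families F, G over `universe` that share at least one member."""
    sets = powerset(universe)
    families = powerset(sets)  # every family of subsets of the universe
    for F in families:
        for G in families:
            if F & G:  # hypothesis: some set A0 is in both F and G
                if not big_intersection(F, universe) <= big_union(G):
                    return False
    return True
```

Calling check_example_3_3_4({0, 1}) examines every pair of families of subsets of {0, 1}. The proof explains why no counterexample can turn up: any shared member A₀ contains every element of ⋂ℱ and is contained in ⋃𝒢.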

Proofs involving the quantifiers for all and there exists are often difficult 
for them. 

That last sentence confused you, didn’t it? You’re probably wondering, 
“Who are they?” Readers of your proofs will experience the same sort of 
confusion if you use variables without explaining what they stand for. 
Beginning proof-writers are sometimes careless about this, and that’s why 
proofs involving the quantifiers for all and there exists are often difficult for 
them. (It made more sense that time, didn’t it?) When you use the strategies 
we’ve discussed in this section, you’ll be introducing new variables into 
your proof, and when you do this, you must always be careful to make it 
clear to the reader what they stand for. 


For example, if you were proving a goal of the form ∀x ∈ A P(x), you
would probably start by introducing a variable x to stand for an arbitrary
element of A. Your reader won’t know what x means, though, unless you
begin your proof with “Let x be an arbitrary element of A,” or “Suppose x ∈
A.” These sentences tell the reader that, from now on, he or she should think
of x as standing for some particular element of A, although which element it
stands for is left unspecified. Of course, you must be clear in your own
mind about what x means. In particular, because x is to be arbitrary, you
must be careful not to assume anything about x other than the fact that x ∈
A. It might help to think of the value of x as being chosen by someone else;
you have no control over which element of A they’ll pick. Using a given of
the form ∃xP(x) is similar. This given tells you that you can introduce a
new variable x₀ into the proof to stand for some object for which P(x₀) is
true, but you cannot assume anything else about x₀. On the other hand, if
you are proving ∃xP(x), your proof will probably start “Let x = ....” This
time you get to choose the value of x, and you must tell the reader explicitly
that you are choosing the value of x and what value you have chosen.

It’s also important, when you’re introducing a new variable x, to be sure
you know what kind of object x stands for. Is it a number? a set? a function?
a matrix? You’d better not write a ∈ X unless X is a set, for example. If you
aren’t careful about this, you might end up writing nonsense. You also
sometimes need to know what kind of object a variable stands for to figure
out the logical form of a statement involving that variable. For example, A =
B means ∀x(x ∈ A ↔ x ∈ B) if A and B are sets, but not if they’re numbers.

The most important thing to keep in mind about introducing variables 
into a proof is simply the fact that variables must always be introduced 
before they are used. If you make a statement about x (i.e., a statement in 
which x occurs as a free variable) without first explaining what x stands for, 
a reader of your proof won’t know what you’re talking about — and there’s a 
good chance that you won’t know what you’re talking about either! 

Because proofs involving quantifiers may require more practice than the 
other proofs we have discussed so far, we end this section with two more 
examples. 


Example 3.3.5. Suppose B is a set and ℱ is a family of sets. Prove that if
⋃ℱ ⊆ B then ℱ ⊆ 𝒫(B).


Scratch work 


We assume ⋃ℱ ⊆ B and try to prove ℱ ⊆ 𝒫(B). Because this goal means
∀x(x ∈ ℱ → x ∈ 𝒫(B)), we let x be arbitrary, assume x ∈ ℱ, and set x ∈
𝒫(B) as our goal. Recall that ℱ is a family of sets, so since x ∈ ℱ, x is a set.
Thus, we now have the following givens and goal:

Givens        Goal
⋃ℱ ⊆ B       x ∈ 𝒫(B)
x ∈ ℱ

To figure out how to prove this goal, we must use the definition of power
set. The statement x ∈ 𝒫(B) means x ⊆ B, or in other words ∀y(y ∈ x → y
∈ B). We must therefore introduce another arbitrary object into the proof.
We let y be arbitrary, assume y ∈ x, and try to prove y ∈ B.

Givens        Goal
⋃ℱ ⊆ B       y ∈ B
x ∈ ℱ
y ∈ x


The goal can be analyzed no further, so we must look more closely at the
givens. Our goal is y ∈ B, and the only given that even mentions B is the
first. In fact, the first given would enable us to reach this goal, if only we
knew that y ∈ ⋃ℱ. This suggests that we might try treating y ∈ ⋃ℱ as our
goal. If we can reach this goal, then we can just add one more step,
applying the first given, and the proof will be done.

Givens        Goal
⋃ℱ ⊆ B       y ∈ ⋃ℱ
x ∈ ℱ
y ∈ x

Once again, we have a goal whose logical form can be analyzed, so we
use the form of the goal to guide our strategy. The goal means ∃A ∈ ℱ(y ∈
A), so to prove it we must find a set A such that A ∈ ℱ and y ∈ A. Looking
at the givens, we see that x is such a set, so the proof is done.


Solution 


Theorem. Suppose B is a set and ℱ is a family of sets. If ⋃ℱ ⊆ B then ℱ ⊆
𝒫(B).


Proof. Suppose ⋃ℱ ⊆ B. Let x be an arbitrary element of ℱ. Let y be an
arbitrary element of x. Since y ∈ x and x ∈ ℱ, by the definition of ⋃ℱ, y ∈
⋃ℱ. But then since ⋃ℱ ⊆ B, y ∈ B. Since y was an arbitrary element of x,
we can conclude that x ⊆ B, so x ∈ 𝒫(B). But x was an arbitrary element of
ℱ, so this shows that ℱ ⊆ 𝒫(B), as required.

□
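The statement proven in Example 3.3.5 can also be tested exhaustively on a small universe. The Python sketch below is our own illustration and is not a substitute for the proof. The function names are hypothetical, and we assume that B and every member of ℱ are subsets of a small finite universe.

```python
from itertools import combinations


def powerset(universe):
    """Return all subsets of `universe` as a list of frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def check_example_3_3_5(universe):
    """Check: whenever the union of F is a subset of B, F is a subset of P(B)."""
    sets = powerset(universe)
    families = powerset(sets)  # every family of subsets of the universe
    for B in sets:
        power_B = set(powerset(B))  # the power set of B
        for F in families:
            union_F = frozenset().union(*F)  # empty when F is empty
            if union_F <= B and not F <= power_B:
                return False  # hypothesis true, conclusion false
    return True
```

For the universe {0, 1} the check covers 4 choices of B and 16 families ℱ. The proof shows why it succeeds: every element y of a member x of ℱ lies in ⋃ℱ and hence in B.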

The Venn diagram in Figure 3.1 may help you see why the theorem in 
Example 3.3.5 is true, and you might find it useful to refer to the picture as 
you reread the proof. But notice that we didn’t prove the theorem by simply 
explaining this picture; the proof was constructed by following the proof 
strategies we have discussed. There are many methods, such as drawing 
pictures or working out examples, that may help you achieve an 
understanding of why a theorem is true. But an explanation of this
understanding is not a proof. To prove a theorem, you must follow the
strategies in this chapter.


Figure 3.1. The small circles represent elements of ℱ, and the shaded region is ⋃ℱ. The large
circle represents B.


The proof in Example 3.3.5 is probably the most complex proof we’ve 
done so far. Read it again and make sure you understand its structure and 
the purpose of every sentence. Isn’t it remarkable how much logical 
complexity has been packed into just a few lines? 

It is not uncommon for a short proof to have such a rich logical structure. 
This efficiency of exposition is one of the most attractive features of proofs, 
but it also often makes them difficult to read. Although we’ve been 
concentrating so far on writing proofs, it is also important to learn how to 
read proofs written by other people. To give you some practice with this, 
we present our last proof in this section without the scratch work. See if you 
can follow the structure of the proof as you read it. We’ll provide a 
commentary after the proof that should help you to understand it. 

For this proof, we need the following definition: 


Definition 3.3.6. For any integers x and y, we’ll say that x divides y (or y is
divisible by x) if ∃k ∈ ℤ(kx = y). We use the notation x | y to mean “x
divides y,” and x ∤ y means “x does not divide y.”


For example, 4 | 20, since 5 · 4 = 20, but 4 ∤ 21.


Theorem 3.3.7. For all integers a, b, and c, if a | b and b | c then a | c.


Proof. Let a, b, and c be arbitrary integers and suppose a | b and b | c. Since 
a | b, we can choose some integer m such that ma = b. Similarly, since b | c, 
we can choose an integer n such that nb = c. Therefore c = nb = nma, so 
since nm is an integer, a | c. 

□


Commentary. The theorem says ∀a ∈ ℤ∀b ∈ ℤ∀c ∈ ℤ(a | b ∧ b | c → a | c),
so the most natural way to proceed is to let a, b, and c be arbitrary integers,
assume a | b and b | c, and then prove a | c. The first sentence of the proof 
indicates that this strategy is being used, so the goal for the rest of the proof 
must be to prove that a | c. The fact that this is the goal for the rest of the 
proof is not explicitly stated. You are expected to figure this out for yourself 
by using your knowledge of proof strategies. You might even want to make 
a givens and goal list to help you keep track of what is known and what 
remains to be proven as you continue to read the proof. At this point in the 
proof, the list would look like this: 

Givens                       Goal
a, b, and c are integers     a | c
a | b
b | c

Because the new goal means ∃k ∈ ℤ(ka = c), the proof will probably
proceed by finding an integer k such that ka = c. As with many proofs of
existential statements, the first step in finding such a k involves looking
more closely at the givens. The next sentence of the proof uses the given a |
b to conclude that we can choose an integer m such that ma = b. The proof
doesn’t say what rule of inference justifies this. It is up to you to figure it
out by working out the logical form of the given statement a | b, using the
definition of divides. Because this given means ∃k ∈ ℤ(ka = b), you should
recognize that the rule of inference being used is existential instantiation.
Existential instantiation is also used in the next sentence of the proof to 
justify choosing an integer n such that nb = c. The equations ma = b and nb 
= c can now be added to the list of givens. 

Some steps have also been skipped in the last sentence of the proof. We 
expected that the goal a | c would be proven by finding an integer k such 
that ka = c. From the equation c = nma and the fact that nm is an integer, it
follows that k = nm will work, but the proof doesn’t explicitly say that this
value of k is being used; in fact, the variable k does not appear at all in the
proof. Of course, the variable k does not appear in the statement of the 
theorem either. A reader of the proof would expect us to prove that a | c by 
finding an integer that, when multiplied by a, gives the value c, but based 
on reading the statement of the theorem, the reader would have no reason to 
expect this integer to be given the name k. Assigning this name to the 
integer nm would therefore not have made the proof easier to understand, so 
we didn’t do it. 
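The role of the unnamed integer nm in the proof of Theorem 3.3.7 can be made concrete with a short computation. The Python sketch below is our own illustration. The function names divides_witness and transitivity_witness are hypothetical, and the handling of x = 0 reflects Definition 3.3.6, under which 0 | y holds only when y = 0.

```python
def divides_witness(x, y):
    """Return an integer k with k * x == y, or None if x does not divide y."""
    if x == 0:
        # 0 | y means some k satisfies k * 0 == y, which forces y == 0.
        return 0 if y == 0 else None
    return y // x if y % x == 0 else None


def transitivity_witness(a, b, c):
    """Given a | b and b | c, return the integer nm used in the proof.

    Returns None when one of the hypotheses fails.
    """
    m = divides_witness(a, b)  # existential instantiation: m * a == b
    n = divides_witness(b, c)  # existential instantiation: n * b == c
    if m is None or n is None:
        return None
    return n * m  # c == n * b == n * m * a, so this value works for k
```

For example, with a = 3, b = 12, and c = 60 we get m = 4 and n = 5, and the returned value 20 satisfies 20 · 3 = 60.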


Exercises 


Note: Exercises marked with the symbol PD can be done with Proof 
Designer, which is computer software that is available free on the internet. 


*1. In exercise 7 of Section 2.2 you used logical equivalences to
show that ∃x(P(x) → Q(x)) is equivalent to ∀xP(x) → ∃xQ(x).
Now use the methods of this section to prove that if ∃x(P(x) →
Q(x)) is true, then ∀xP(x) → ∃xQ(x) is true. (Note: The other
direction of the equivalence is quite a bit harder to prove. See
exercise 30 of Section 3.5.)

2. Prove that if A and B \ C are disjoint, then A ∩ B ⊆ C.

*3. Prove that if A ⊆ B \ C then A and C are disjoint.

PD4. Suppose A ⊆ 𝒫(A). Prove that 𝒫(A) ⊆ 𝒫(𝒫(A)).

5. The hypothesis of the theorem proven in exercise 4 is A ⊆ 𝒫(A).
(a) Can you think of a set A for which this hypothesis is true?
(b) Can you think of another?


6. Suppose x is a real number.
(a) Prove that if x ≠ 1 then there is a real number y such that (y + 1)/(y − 2) = x.
(b) Prove that if there is a real number y such that (y + 1)/(y − 2) = x, then x ≠
1.

*7. Prove that for every real number x, if x > 2 then there is a real
number y such that y + 1/y = x.

PD8. Prove that if ℱ is a family of sets and A ∈ ℱ, then A ⊆ ⋃ℱ.


*9. Prove that if ℱ is a family of sets and A ∈ ℱ, then ⋂ℱ ⊆ A.

10. Suppose that ℱ is a nonempty family of sets, B is a set, and ∀A ∈
ℱ(B ⊆ A). Prove that B ⊆ ⋂ℱ.

11. Suppose that ℱ is a family of sets. Prove that if ∅ ∈ ℱ then ⋂ℱ
= ∅.

PD*12. Suppose ℱ and 𝒢 are families of sets. Prove that if ℱ ⊆ 𝒢 then
⋃ℱ ⊆ ⋃𝒢.

13. Suppose ℱ and 𝒢 are nonempty families of sets. Prove that if ℱ
⊆ 𝒢 then ⋂𝒢 ⊆ ⋂ℱ.

*14. Suppose that {Aᵢ | i ∈ I} is an indexed family of sets. Prove that
⋃_{i∈I} 𝒫(Aᵢ) ⊆ 𝒫(⋃_{i∈I} Aᵢ). (Hint: First make sure you know what
all the notation means!)

15. Suppose {Aᵢ | i ∈ I} is an indexed family of sets and I ≠ ∅. Prove
that ⋂_{i∈I} Aᵢ ∈ ⋂_{i∈I} 𝒫(Aᵢ).

PD16. Prove the converse of the statement proven in Example 3.3.5. In
other words, prove that if ℱ ⊆ 𝒫(B) then ⋃ℱ ⊆ B.

*17. Suppose ℱ and 𝒢 are nonempty families of sets, and every
element of ℱ is a subset of every element of 𝒢. Prove that ⋃ℱ ⊆
⋂𝒢.

18. In this problem all variables range over ℤ, the set of all integers.
(a) Prove that if a | b and a | c, then a | (b + c).
(b) Prove that if ac | bc and c ≠ 0, then a | b.

19. (a) Prove that for all real numbers x and y there is a real number
z such that x + z = y − z.
(b) Would the statement in part (a) be correct if “real number” were
changed to “integer”? Justify your answer.

*20. Consider the following theorem:
Theorem. For every real number x, x² ≥ 0.
What’s wrong with the following proof of the theorem?
Proof. Suppose not. Then for every real number x, x² < 0. In
particular, plugging in x = 3 we would get 9 < 0, which is clearly
false. This contradiction shows that for every number x, x² ≥ 0.
□

21. Consider the following incorrect theorem:
Incorrect Theorem. If ∀x ∈ A(x ≠ 0) and A ⊆ B then ∀x ∈ B(x ≠
0).
(a) What’s wrong with the following proof of the theorem?
Proof. Suppose that ∀x ∈ A(x ≠ 0) and A ⊆ B. Let x be an
arbitrary element of A. Since ∀x ∈ A(x ≠ 0), we can conclude
that x ≠ 0. Also, since A ⊆ B, x ∈ B. Since x ∈ B, x ≠ 0, and x
was arbitrary, we can conclude that ∀x ∈ B(x ≠ 0). □
(b) Find a counterexample to the theorem. In other words, find an
example of sets A and B for which the hypotheses of the theorem
are true but the conclusion is false.

*22. Consider the following incorrect theorem:
Incorrect Theorem. ∃x ∈ ℝ∀y ∈ ℝ(xy² = y − x).
What’s wrong with the following proof of the theorem?
Proof. Let x = y/(y² + 1). Then
y − x = y − y/(y² + 1) = y³/(y² + 1) = (y/(y² + 1)) · y² = xy².
□

23. Consider the following incorrect theorem:
Incorrect Theorem. Suppose ℱ and 𝒢 are families of sets. If ⋃ℱ
and ⋃𝒢 are disjoint, then so are ℱ and 𝒢.
(a) What’s wrong with the following proof of the theorem?
Proof. Suppose ⋃ℱ and ⋃𝒢 are disjoint. Suppose ℱ and 𝒢 are
not disjoint. Then we can choose some set A such that A ∈ ℱ and
A ∈ 𝒢. Since A ∈ ℱ, by exercise 8, A ⊆ ⋃ℱ, so every element of
A is in ⋃ℱ. Similarly, since A ∈ 𝒢, every element of A is in ⋃𝒢.
But then every element of A is in both ⋃ℱ and ⋃𝒢, and this is
impossible since ⋃ℱ and ⋃𝒢 are disjoint. Thus, we have
reached a contradiction, so ℱ and 𝒢 must be disjoint. □
(b) Find a counterexample to the theorem.

24. Consider the following putative theorem:
Theorem? For all real numbers x and y, x² + xy − 2y² = 0.
(a) What’s wrong with the following proof of the theorem?
Proof. Let x and y be equal to some arbitrary real number r. Then
x² + xy − 2y² = r² + r · r − 2r² = 0.
Since x and y were both arbitrary, this shows that for all real
numbers x and y, x² + xy − 2y² = 0.
□
(b) Is the theorem correct? Justify your answer with either a proof or
a counterexample.

*25. Prove that for every real number x there is a real number y such
that for every real number z, yz = (x + z)² − (x² + z²).

26. (a) Comparing the various rules for dealing with quantifiers in
proofs, you should see a similarity between the rules for goals of
the form ∀xP(x) and givens of the form ∃xP(x). What is this
similarity? What about the rules for goals of the form ∃xP(x) and
givens of the form ∀xP(x)?
(b) Can you think of a reason why these similarities might be
expected? (Hint: Think about how proof by contradiction works
when the goal starts with a quantifier.)


3.4 Proofs Involving Conjunctions and Biconditionals


The method for proving a goal of the form P ∧ Q is very simple:


To prove a goal of the form P ∧ Q:
Prove P and Q separately.


In other words, a goal of the form P ∧ Q is treated as two separate goals: P,
and Q. The same is true of givens of the form P ∧ Q:


To use a given of the form P ∧ Q:
Treat this given as two separate givens: P, and Q.


We’ve already used these ideas, without mention, in some of our
previous examples. For example, the definition of the given x ∈ A \ C in
Example 3.2.3 was x ∈ A ∧ x ∉ C, but we treated it as two separate givens:
x ∈ A, and x ∉ C.


Example 3.4.1. Suppose A ⊆ B, and A and C are disjoint. Prove that A ⊆
B \ C.


Scratch work


Givens        Goal
A ⊆ B         A ⊆ B \ C
A ∩ C = ∅


Analyzing the logical form of the goal, we see that it has the form ∀x(x ∈
A → x ∈ B \ C), so we let x be arbitrary, assume x ∈ A, and try to prove that
x ∈ B \ C. The new goal x ∈ B \ C means x ∈ B ∧ x ∉ C, so according to
our strategy we should split this into two goals, x ∈ B and x ∉ C, and prove
them separately.


Givens        Goals
A ⊆ B         x ∈ B
A ∩ C = ∅     x ∉ C
x ∈ A


The final proof will have this form:


Let x be arbitrary.
    Suppose x ∈ A.
        [Proof of x ∈ B goes here.]
        [Proof of x ∉ C goes here.]
        Thus, x ∈ B ∧ x ∉ C, so x ∈ B \ C.
    Therefore x ∈ A → x ∈ B \ C.
Since x was arbitrary, ∀x(x ∈ A → x ∈ B \ C), so A ⊆ B \ C.


The first goal, x ∈ B, clearly follows from the fact that x ∈ A and A ⊆ B.
The second goal, x ∉ C, follows from x ∈ A and A ∩ C = ∅. You can see
this by analyzing the logical form of the statement A ∩ C = ∅. It is a
negative statement, but it can be reexpressed as an equivalent positive
statement:


A ∩ C = ∅ is equivalent to ¬∃y(y ∈ A ∧ y ∈ C) (definitions of ∩ and ∅),
which is equivalent to ∀y¬(y ∈ A ∧ y ∈ C) (quantifier negation law),
which is equivalent to ∀y(y ∉ A ∨ y ∉ C) (De Morgan’s law),
which is equivalent to ∀y(y ∈ A → y ∉ C) (conditional law).


Plugging in x for y in this last statement, we see that x ∈ A → x ∉ C, and
since we already know x ∈ A, we can conclude that x ∉ C.


Solution 
Theorem. Suppose A ⊆ B, and A and C are disjoint. Then A ⊆ B \ C.


Proof. Suppose x ∈ A. Since A ⊆ B, it follows that x ∈ B, and since A and C
are disjoint, we must have x ∉ C. Thus, x ∈ B \ C. Since x was an arbitrary
element of A, we can conclude that A ⊆ B \ C.
□
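The two-goal structure of the proof in Example 3.4.1 can be mirrored in a small search for counterexamples. The following Python sketch is our own illustration, offered as a sanity check rather than a proof. The names subsets and find_counterexample are hypothetical, and the sets A, B, and C are assumed to be subsets of a small finite universe.

```python
from itertools import combinations


def subsets(universe):
    """Return all subsets of `universe` as a list of sets."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def find_counterexample(universe):
    """Search for A, B, C with A a subset of B and A, C disjoint,
    but A not a subset of B \\ C. Return None if none is found."""
    for A in subsets(universe):
        for B in subsets(universe):
            for C in subsets(universe):
                if A <= B and not (A & C):  # the two givens
                    for x in A:
                        goal_1 = x in B       # first goal: x is in B
                        goal_2 = x not in C   # second goal: x is not in C
                        if not (goal_1 and goal_2):
                            return (A, B, C)
    return None
```

The search over all subsets of {0, 1, 2} finds nothing, as the theorem predicts. Each element x of A passes both goals for exactly the reasons given in the proof.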
Using our strategies for working with conjunctions, we can now work out
the proper way to deal with statements of the form P ↔ Q in proofs.
Because P ↔ Q is equivalent to (P → Q) ∧ (Q → P), according to our
strategies a given or goal of the form P ↔ Q should be treated as two
separate givens or goals: P → Q, and Q → P.


To prove a goal of the form P ↔ Q:
Prove P → Q and Q → P separately.


To use a given of the form P ↔ Q:
Treat this as two separate givens: P → Q, and Q → P.


This is illustrated in the next example, in which we use the following 
definitions. 


Definition 3.4.2. An integer x is even if ∃k ∈ ℤ(x = 2k), and x is odd if ∃k
∈ ℤ(x = 2k + 1).


We also use the fact that every integer is either even or odd, but not both.
For a proof of this fact, see exercise 16 in Section 6.1.


Example 3.4.3. Suppose x is an integer. Prove that x is even iff x² is even.


Scratch work 


The goal is (x is even) ↔ (x² is even), so we prove the two goals (x is even)
→ (x² is even) and (x² is even) → (x is even) separately. For the first, we
assume that x is even and prove that x² is even:

Givens        Goal
x ∈ ℤ         x² is even
x is even

Writing out the definition of even in both the given and the goal will reveal
their logical forms:

Givens              Goal
x ∈ ℤ               ∃k ∈ ℤ(x² = 2k)
∃k ∈ ℤ(x = 2k)

Because the second given starts with ∃k, we immediately use it and let k
stand for some particular integer for which the statement x = 2k is true.
Thus, we have two new given statements: k ∈ ℤ, and x = 2k.

Givens        Goal
x ∈ ℤ         ∃k ∈ ℤ(x² = 2k)
k ∈ ℤ
x = 2k


The goal starts with ∃k, but since k is already being used to stand for a
particular number, we cannot assign a new value to k to prove the goal. We
must therefore switch to a different letter, say j. One way to understand this
is to think of rewriting the goal in the equivalent form ∃j ∈ ℤ(x² = 2j). To
prove this goal we must come up with a value to plug in for j. It must be an
integer, and it must satisfy the equation x² = 2j. Using the given equation x
= 2k, we see that x² = (2k)² = 4k² = 2(2k²), so it looks like the right value to
choose for j is j = 2k². Clearly 2k² is an integer, so this choice for j will
work to complete the proof of our first goal.


To prove the second goal (x² is even) → (x is even), we’ll prove the
contrapositive (x is not even) → (x² is not even) instead. Since any integer
is either even or odd but not both, this is equivalent to the statement (x is
odd) → (x² is odd).

Givens        Goal
x ∈ ℤ         x² is odd
x is odd

The steps are now quite similar to the first part of the proof. As before,
we begin by writing out the definition of odd in both the second given and
the goal. This time, to avoid the conflict of variable names we ran into in
the first part of the proof, we use different names for the bound variables in
the two statements.

Givens                  Goal
x ∈ ℤ                   ∃j ∈ ℤ(x² = 2j + 1)
∃k ∈ ℤ(x = 2k + 1)

Next we use the second given and let k stand for a particular integer for
which x = 2k + 1.

Givens        Goal
x ∈ ℤ         ∃j ∈ ℤ(x² = 2j + 1)
k ∈ ℤ
x = 2k + 1

We must now find an integer j such that x² = 2j + 1. Plugging in 2k + 1
for x we get x² = (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1, so j = 2k² + 2k
looks like the right choice.


Before giving the final write-up of the proof, we should make a few
explanatory remarks. The two conditional statements we’ve proven can be
thought of as representing the two directions → and ← of the biconditional
symbol ↔ in the original goal. These two parts of the proof are sometimes
labeled with the symbols → and ←. In each part, we end up proving a
statement that asserts the existence of a number with certain properties. We
called this number j in the scratch work, but note that j was not mentioned
explicitly in the statement of the problem. As in the proof of Theorem 3.3.7,
we have chosen not to mention j explicitly in the final proof either.


Solution 
Theorem. Suppose x is an integer. Then x is even iff x² is even.


Proof. (→) Suppose x is even. Then for some integer k, x = 2k. Therefore,
x² = 4k² = 2(2k²), so since 2k² is an integer, x² is even. Thus, if x is even
then x² is even.


(←) Suppose x is odd. Then x = 2k + 1 for some integer k. Therefore, x²
= (2k + 1)² = 4k² + 4k + 1 = 2(2k² + 2k) + 1, so since 2k² + 2k is an integer,
x² is odd. Thus, if x² is even then x is even.
□
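The witnesses j = 2k² and j = 2k² + 2k found in the scratch work for Example 3.4.3 can be computed directly. The Python sketch below is our own illustration. The function names are hypothetical, and the code uses the fact that Python’s divmod(x, 2) returns a pair (k, r) with x = 2k + r and r equal to 0 or 1 for every integer x.

```python
def even_square_witness(x):
    """For even x = 2k, return j = 2k^2, which satisfies x*x == 2*j."""
    k, remainder = divmod(x, 2)
    if remainder != 0:
        raise ValueError("x must be even")
    return 2 * k * k


def odd_square_witness(x):
    """For odd x = 2k + 1, return j = 2k^2 + 2k, so that x*x == 2*j + 1."""
    k, remainder = divmod(x, 2)
    if remainder != 1:
        raise ValueError("x must be odd")
    return 2 * k * k + 2 * k
```

For instance, x = 6 gives k = 3 and j = 18, and indeed 36 = 2 · 18. For x = 7 we get k = 3 and j = 24, and 49 = 2 · 24 + 1.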
Using the proof techniques we’ve developed, we can now verify some of
the equivalences that we were only able to justify on intuitive grounds in
Chapter 2. As an example of this, let’s prove that the formulas ∀x¬P(x) and
¬∃xP(x) are equivalent. To say that these formulas are equivalent means
that they will always have the same truth value. In other words, no matter
what statement P(x) stands for, the statement ∀x¬P(x) ↔ ¬∃xP(x) will be
true. We can prove this using our technique for proving biconditional
statements.


Example 3.4.4. Prove that ∀x¬P(x) ↔ ¬∃xP(x).

Scratch work


(→) We must prove ∀x¬P(x) → ¬∃xP(x), so we assume ∀x¬P(x) and try
to prove ¬∃xP(x). Our goal is now a negated statement, and reexpressing it
would require the use of the very equivalence that we are trying to prove!
We therefore fall back on our only other strategy for dealing with negative
goals, proof by contradiction. We now have the following situation:

Givens        Goal
∀x¬P(x)       Contradiction
∃xP(x)

The second given starts with an existential quantifier, so we use it
immediately and let x₀ stand for some object for which the statement P(x₀)
is true. But now plugging in x₀ for x in the first given we can conclude that
¬P(x₀), which gives us the contradiction we need.


(←) For this direction of the biconditional we should assume ¬∃xP(x)
and try to prove ∀x¬P(x). Because this goal starts with a universal
quantifier, we let x be arbitrary and try to prove ¬P(x). Once again, we now
have a negated goal that can’t be reexpressed, so we use proof by
contradiction:

Givens        Goal
¬∃xP(x)       Contradiction
P(x)

Our first given is also a negated statement, and this suggests that we
could get the contradiction we need by proving ∃xP(x). We therefore set
this as our goal.

Givens        Goal
¬∃xP(x)       ∃xP(x)
P(x)

To keep from confusing the x that appears as a free variable in the second
given (the arbitrary x introduced earlier in the proof) with the x that appears
as a bound variable in the goal, you might want to rewrite the goal in the
equivalent form ∃yP(y). To prove this goal we have to find a value of y that
makes P(y) come out true. But this is easy! Our second given, P(x), tells us
that our arbitrary x is the value we need.


Solution 


Theorem. ∀x¬P(x) ↔ ¬∃xP(x).


Proof. (→) Suppose ∀x¬P(x), and suppose ∃xP(x). Then we can choose
some x₀ such that P(x₀) is true. But since ∀x¬P(x), we can conclude that
¬P(x₀), and this is a contradiction. Therefore ∀x¬P(x) → ¬∃xP(x).


(←) Suppose ¬∃xP(x). Let x be arbitrary, and suppose P(x). Since we
have a specific x for which P(x) is true, it follows that ∃xP(x), which is a
contradiction. Therefore, ¬P(x). Since x was arbitrary, we can conclude that
∀x¬P(x), so ¬∃xP(x) → ∀x¬P(x).

□
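On a finite universe of discourse, the equivalence proven in Example 3.4.4 can be checked for every possible predicate. The Python sketch below is our own illustration. The function name equivalence_holds is hypothetical, and we assume that a predicate P on a finite domain can be represented by a table of truth values. Python’s built-in all and any play the roles of ∀ and ∃.

```python
from itertools import product


def equivalence_holds(domain):
    """Check that 'for all x, not P(x)' and 'not (there exists x with P(x))'
    agree for every predicate P on the finite domain."""
    domain = list(domain)
    # Each assignment of True/False to the domain elements is one predicate P.
    for truth_values in product([False, True], repeat=len(domain)):
        P = dict(zip(domain, truth_values))
        forall_not = all(not P[x] for x in domain)
        not_exists = not any(P[x] for x in domain)
        if forall_not != not_exists:
            return False
    return True
```

A finite check of this kind cannot replace the proof, which applies to every universe of discourse, but it shows the two formulas agreeing predicate by predicate.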

Sometimes in a proof of a goal of the form P ↔ Q the steps in the proof
of Q → P are the same as the steps used to prove P → Q, but in reverse
order. In this case you may be able to simplify the proof by writing it as a
string of equivalences, starting with P and ending with Q. For example,
suppose you found that you could prove P → Q by first assuming P, then
using P to infer some other statement R, and then using R to deduce Q; and
suppose that the same steps could be used, in reverse order, to prove that Q
→ P. In other words, you could assume Q, use this assumption to conclude
that R was true, and then use R to prove P. Since you would be asserting
both P → R and R → P, you could sum up these two steps by saying P ↔
R. Similarly, the other two steps of the proof tell you that R ↔ Q. These two
statements imply the goal P ↔ Q. Mathematicians sometimes present this
kind of proof by simply writing the string of equivalences


P iff R iff Q. 


You can think of this as an abbreviation for “P iff R and R iff Q (and 
therefore P iff Q).” This is illustrated in the next example. 


Example 3.4.5. Suppose A, B, and C are sets. Prove that A ∩ (B \ C) = (A
∩ B) \ C.


Scratch work


As we saw in Chapter 2, the equation A ∩ (B \ C) = (A ∩ B) \ C means ∀x(x
∈ A ∩ (B \ C) ↔ x ∈ (A ∩ B) \ C), but it is also equivalent to the statement
[A ∩ (B \ C) ⊆ (A ∩ B) \ C] ∧ [(A ∩ B) \ C ⊆ A ∩ (B \ C)]. This suggests
two approaches to the proof. We could let x be arbitrary and then prove x ∈
A ∩ (B \ C) ↔ x ∈ (A ∩ B) \ C, or we could prove the two statements A ∩
(B \ C) ⊆ (A ∩ B) \ C and (A ∩ B) \ C ⊆ A ∩ (B \ C). In fact, almost every
proof that two sets are equal will involve one of these two approaches. In
this case we will use the first approach, so once we have introduced our
arbitrary x, we will have an iff goal.

For the (→) half of the proof we assume x ∈ A ∩ (B \ C) and try to prove
x ∈ (A ∩ B) \ C:

Givens                Goal
x ∈ A ∩ (B \ C)       x ∈ (A ∩ B) \ C


To see the logical forms of the given and goal, we write out their
definitions as follows:

x ∈ A ∩ (B \ C) iff x ∈ A ∧ x ∈ B \ C iff x ∈ A ∧ x ∈ B ∧ x ∉ C;
x ∈ (A ∩ B) \ C iff x ∈ A ∩ B ∧ x ∉ C iff x ∈ A ∧ x ∈ B ∧ x ∉ C.


At this point it is clear that the given implies the goal, since the last steps
in both strings of equivalences turned out to be identical. In fact, it is also
clear that the reasoning involved in the (←) direction of the proof will be
exactly the same, but with the given and goal columns reversed. Thus, we
might try to shorten the proof by writing it as a string of equivalences,
starting with x ∈ A ∩ (B \ C) and ending with x ∈ (A ∩ B) \ C. In this case,
if we start with x ∈ A ∩ (B \ C) and follow the first string of equivalences
displayed above, we come to a statement that is the same as the last
statement in the second string of equivalences. We can then continue by
following the second string of equivalences backward, ending with x ∈ (A
∩ B) \ C.


Solution 
Theorem. Suppose A, B, and C are sets. Then A ∩ (B \ C) = (A ∩ B) \ C.


Proof. Let x be arbitrary. Then
x ∈ A ∩ (B \ C) iff x ∈ A ∧ x ∈ B \ C
                iff x ∈ A ∧ x ∈ B ∧ x ∉ C
                iff x ∈ (A ∩ B) ∧ x ∉ C
                iff x ∈ (A ∩ B) \ C.
Thus, ∀x(x ∈ A ∩ (B \ C) ↔ x ∈ (A ∩ B) \ C), so A ∩ (B \ C) = (A ∩ B) \ C.
□
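Each link in the string of equivalences in Example 3.4.5 can be evaluated separately for concrete sets. The Python sketch below is our own illustration. The names subsets and membership_chain are hypothetical, and the sets are assumed to be finite. Every entry of the returned list corresponds to one statement in the chain, so all entries should have the same truth value.

```python
from itertools import combinations


def subsets(universe):
    """Return all subsets of `universe` as a list of sets."""
    items = sorted(universe)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def membership_chain(x, A, B, C):
    """Truth values of the successive statements in the proof, in order."""
    return [
        x in (A & (B - C)),                # x in A ∩ (B \ C)
        x in A and x in (B - C),           # x in A and x in B \ C
        x in A and x in B and x not in C,  # x in A, x in B, x not in C
        x in (A & B) and x not in C,       # x in A ∩ B and x not in C
        x in ((A & B) - C),                # x in (A ∩ B) \ C
    ]
```

If the chain is constant for every x, then the first and last statements agree for every x, which is exactly what the equation A ∩ (B \ C) = (A ∩ B) \ C asserts.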


The technique of figuring out a sequence of equivalences in one order 
and then writing it in the reverse order is used quite often in proofs. The 
order in which the steps should be written in the final proof is determined 
by our rule that an assertion should never be made until it can be justified. 
In particular, if you are trying to prove P ↔ Q, it is wrong to start your
write-up of the proof with the unjustified statement P ↔ Q and then work
out the meanings of the two sides P and Q, showing that they are the same.
You should instead start with equivalences you can justify and string them
together to produce a justification of the goal P ↔ Q before you assert this
goal. A similar technique can sometimes be used to figure out proofs of 
equations, as the next example shows. 


Example 3.4.6. Prove that for any real numbers a and b, 
(a +b)? - 4(a - b)? =(3b - a)(3a - b). 
Scratch work 


The goal has the form ∀a∀b((a + b)² - 4(a - b)² = (3b - a)(3a - b)), so we
start by letting a and b be arbitrary real numbers and try to prove the
equation. Multiplying out both sides gives us:
(a + b)² - 4(a - b)² = a² + 2ab + b² - 4(a² - 2ab + b²)
                     = -3a² + 10ab - 3b²;
(3b - a)(3a - b) = 9ab - 3a² - 3b² + ab = -3a² + 10ab - 3b².
Clearly the two sides are equal. The simplest way to write the proof of
this is to write a string of equalities starting with (a + b)² - 4(a - b)² and
ending with (3b - a)(3a - b). We can do this by copying down the first
string of equalities displayed above, and then following it with the last line,
written backward.


Solution
Theorem. For any real numbers a and b,
(a + b)² - 4(a - b)² = (3b - a)(3a - b).


Proof. Let a and b be arbitrary real numbers. Then
(a + b)² - 4(a - b)² = a² + 2ab + b² - 4(a² - 2ab + b²)
                     = -3a² + 10ab - 3b²
                     = 9ab - 3a² - 3b² + ab = (3b - a)(3a - b).
□
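
As a sanity check on the algebra, the three expressions in the string of equalities can be evaluated on a grid of integers. This is a minimal sketch; the function names lhs, middle, rhs, and chain_holds are invented here, and agreement on finitely many points does not replace the proof.

```python
def lhs(a, b):
    """Left side of the theorem: (a + b)^2 - 4(a - b)^2."""
    return (a + b) ** 2 - 4 * (a - b) ** 2


def middle(a, b):
    """The common simplified form: -3a^2 + 10ab - 3b^2."""
    return -3 * a ** 2 + 10 * a * b - 3 * b ** 2


def rhs(a, b):
    """Right side of the theorem: (3b - a)(3a - b)."""
    return (3 * b - a) * (3 * a - b)


def chain_holds(limit=20):
    """Compare all three expressions for integers a, b in [-limit, limit]."""
    values = range(-limit, limit + 1)
    return all(lhs(a, b) == middle(a, b) == rhs(a, b)
               for a in values for b in values)


print(chain_holds())
```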


We end this section by presenting another proof without preliminary 
scratch work, but with a commentary to help you read the proof. 


Theorem 3.4.7. For every integer n, 6 | n iff 2 | n and 3 | n.


Proof. Let n be an arbitrary integer.
(→) Suppose 6 | n. Then we can choose an integer k such that 6k = n.
Therefore n = 6k = 2(3k), so 2 | n, and similarly n = 6k = 3(2k), so 3 | n.
(←) Suppose 2 | n and 3 | n. Then we can choose integers j and k such that
n = 2j and n = 3k. Therefore 6(j - k) = 6j - 6k = 3(2j) - 2(3k) = 3n - 2n = n,
so 6 | n.
□


Commentary. The statement to be proven is ∀n ∈ ℤ[6 | n ↔ ((2 | n) ∧ (3 | n))],
and the most natural strategy for proving a goal of this form is to let n
be arbitrary and then prove both directions of the biconditional separately. It
should be clear that this is the strategy being used in the proof.

For the left-to-right direction of the biconditional, we assume 6 | n and
then prove 2 | n and 3 | n, treating this as two separate goals. The
introduction of the integer k is justified by existential instantiation, since the
assumption 6 | n means ∃k ∈ ℤ(6k = n). At this point in the proof we have
the following givens and goals:

Givens        Goals
n ∈ ℤ         2 | n
k ∈ ℤ         3 | n
6k = n

The first goal, 2 | n, means ∃j ∈ ℤ(2j = n), so we must find an integer j
such that 2j = n. Although the proof doesn’t say so explicitly, the equation
n = 2(3k), which is derived in the proof, suggests that the value being used
for j is j = 3k. Clearly, 3k is an integer (another step skipped in the proof),
so this choice for j works. The proof of 3 | n is similar.

For the right-to-left direction we assume 2 | n and 3 | n and prove 6 | n.
Once again, the introduction of j and k is justified by existential
instantiation. No explanation is given for why we should compute 6(j - k),
but a proof need not provide such explanations. The reason for the
calculation should become clear when, surprisingly, it turns out that
6(j - k) = n. Such surprises provide part of the pleasure of working with
proofs. As in the first half of the proof, since j - k is an integer, this shows
that 6 | n.
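
The witnesses used in this proof can be computed directly. The sketch below, with hypothetical helper names divides and witness_for_six, recovers j and k from n and confirms that 6(j - k) = n for a range of integers. It illustrates the calculation rather than proving it.

```python
def divides(d, n):
    """Return True when the integer d divides the integer n."""
    return n % d == 0


def witness_for_six(n):
    """For n divisible by 2 and 3, return j - k where n = 2j and n = 3k."""
    j = n // 2  # exact, because 2 | n
    k = n // 3  # exact, because 3 | n
    return j - k


def theorem_holds(limit=100):
    """Check 6 | n iff (2 | n and 3 | n), and the witness, for |n| <= limit."""
    for n in range(-limit, limit + 1):
        both = divides(2, n) and divides(3, n)
        if divides(6, n) != both:
            return False
        if both and 6 * witness_for_six(n) != n:
            return False
    return True


print(witness_for_six(12), theorem_holds())
```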


Exercises


*1. Use the methods of this chapter to prove that ∀x(P(x) ∧ Q(x)) is
equivalent to ∀xP(x) ∧ ∀xQ(x).

PD 2. Prove that if A ⊆ B and A ⊆ C then A ⊆ B ∩ C.

PD 3. Suppose A ⊆ B. Prove that for every set C, C \ B ⊆ C \ A.

PD *4. Prove that if A ⊆ B and A ⊈ C then B ⊈ C.

PD 5. Prove that if A ⊆ B \ C and A ≠ ∅ then B ⊈ C.

6. Prove that for any sets A, B, and C, A \ (B ∩ C) = (A \ B) ∪ (A \ C),
by finding a string of equivalences starting with x ∈ A \ (B ∩ C) and
ending with x ∈ (A \ B) ∪ (A \ C). (See Example 3.4.5.)

7. Use the methods of this chapter to prove that for any sets A and B,
𝒫(A ∩ B) = 𝒫(A) ∩ 𝒫(B).

8. Prove that A ⊆ B iff 𝒫(A) ⊆ 𝒫(B).

9. Prove that if x and y are odd integers, then xy is odd.

10. Prove that if x and y are odd integers, then x - y is even.

11. Prove that for every integer n, n³ is even iff n is even.

12. Consider the following putative theorem:

Theorem? Suppose m is an even integer and n is an odd integer.
Then n² - m² = n + m.

(a) What’s wrong with the following proof of the theorem?

Proof. Since m is even, we can choose some integer k such that m =
2k. Similarly, since n is odd we have n = 2k + 1. Therefore
n² - m² = (2k + 1)² - (2k)² = 4k² + 4k + 1 - 4k² = 4k + 1
        = (2k + 1) + (2k) = n + m.
□

(b) Is the theorem correct? Justify your answer with either a proof or
a counterexample.


*13. Prove that ∀x ∈ ℝ[∃y ∈ ℝ(x + y = xy) ↔ x ≠ 1].

14. Prove that ∃z ∈ ℝ∀x ∈ ℝ⁺[∃y ∈ ℝ(y - x = y/x) ↔ x ≠ z].

PD 15. Suppose B is a set and ℱ is a family of sets. Prove that
∪{A \ B | A ∈ ℱ} ⊆ ∪(ℱ \ 𝒫(B)).

*16. Suppose ℱ and 𝒢 are nonempty families of sets and every
element of ℱ is disjoint from some element of 𝒢. Prove that ∪ℱ
and ∩𝒢 are disjoint.

PD 17. Prove that for any set A, A = ∪𝒫(A).

PD *18. Suppose ℱ and 𝒢 are families of sets.
(a) Prove that ∪(ℱ ∩ 𝒢) ⊆ (∪ℱ) ∩ (∪𝒢).
(b) What’s wrong with the following proof that (∪ℱ) ∩ (∪𝒢) ⊆ ∪(ℱ ∩ 𝒢)?

Proof. Suppose x ∈ (∪ℱ) ∩ (∪𝒢). This means that x ∈ ∪ℱ and x ∈
∪𝒢, so ∃A ∈ ℱ(x ∈ A) and ∃A ∈ 𝒢(x ∈ A). Thus, we can choose a set
A such that A ∈ ℱ, A ∈ 𝒢, and x ∈ A. Since A ∈ ℱ and A ∈ 𝒢, A ∈ ℱ ∩
𝒢. Therefore ∃A ∈ ℱ ∩ 𝒢(x ∈ A), so x ∈ ∪(ℱ ∩ 𝒢). Since x was
arbitrary, we can conclude that (∪ℱ) ∩ (∪𝒢) ⊆ ∪(ℱ ∩ 𝒢).
□

(c) Find an example of families of sets ℱ and 𝒢 for which
∪(ℱ ∩ 𝒢) ≠ (∪ℱ) ∩ (∪𝒢).

PD 19. Suppose ℱ and 𝒢 are families of sets. Prove that (∪ℱ) ∩ (∪𝒢)
⊆ ∪(ℱ ∩ 𝒢) iff ∀A ∈ ℱ∀B ∈ 𝒢(A ∩ B ⊆ ∪(ℱ ∩ 𝒢)).


PD 20. Suppose ℱ and 𝒢 are families of sets. Prove that ∪ℱ and ∪𝒢 are
disjoint iff for all A ∈ ℱ and B ∈ 𝒢, A and B are disjoint.

PD 21. Suppose ℱ and 𝒢 are families of sets.
(a) Prove that (∪ℱ) \ (∪𝒢) ⊆ ∪(ℱ \ 𝒢).
(b) What’s wrong with the following proof that ∪(ℱ \ 𝒢) ⊆ (∪ℱ) \ (∪𝒢)?

Proof. Suppose x ∈ ∪(ℱ \ 𝒢). Then we can choose some A ∈ ℱ \ 𝒢
such that x ∈ A. Since A ∈ ℱ \ 𝒢, A ∈ ℱ and A ∉ 𝒢. Since x ∈ A and
A ∈ ℱ, x ∈ ∪ℱ. Since x ∈ A and A ∉ 𝒢, x ∉ ∪𝒢. Therefore
x ∈ (∪ℱ) \ (∪𝒢).
□

(c) Prove that ∪(ℱ \ 𝒢) ⊆ (∪ℱ) \ (∪𝒢) iff ∀A ∈ (ℱ \ 𝒢)∀B ∈ 𝒢(A ∩ B = ∅).
(d) Find an example of families of sets ℱ and 𝒢 for which
∪(ℱ \ 𝒢) ≠ (∪ℱ) \ (∪𝒢).

PD *22. Suppose ℱ and 𝒢 are families of sets. Prove that if ∪ℱ ⊈ ∪𝒢
then there is some A ∈ ℱ such that for all B ∈ 𝒢, A ⊈ B.

23. Suppose B is a set, {A_i | i ∈ I} is an indexed family of sets, and
I ≠ ∅.
(a) What proof strategies are used in the following proof of the
equation B ∩ (∪_{i∈I} A_i) = ∪_{i∈I} (B ∩ A_i)?

Proof. Let x be arbitrary. Suppose x ∈ B ∩ (∪_{i∈I} A_i). Then x ∈ B
and x ∈ ∪_{i∈I} A_i, so we can choose some i_0 ∈ I such that x ∈ A_{i_0}.
Since x ∈ B and x ∈ A_{i_0}, x ∈ B ∩ A_{i_0}. Therefore
x ∈ ∪_{i∈I} (B ∩ A_i).

Now suppose x ∈ ∪_{i∈I} (B ∩ A_i). Then we can choose some i_0
∈ I such that x ∈ B ∩ A_{i_0}. Therefore x ∈ B and x ∈ A_{i_0}. Since
x ∈ A_{i_0}, x ∈ ∪_{i∈I} A_i. Since x ∈ B and x ∈ ∪_{i∈I} A_i,
x ∈ B ∩ (∪_{i∈I} A_i).
Since x was arbitrary, we have shown that ∀x[x ∈ B ∩ (∪_{i∈I} A_i)
↔ x ∈ ∪_{i∈I} (B ∩ A_i)], so B ∩ (∪_{i∈I} A_i) = ∪_{i∈I} (B ∩ A_i).
□

(b) Prove that B \ (∪_{i∈I} A_i) = ∩_{i∈I} (B \ A_i).
(c) Can you discover and prove a similar theorem about
B \ (∩_{i∈I} A_i)? (Hint: Try to guess the theorem, and then try to
prove it. If you can’t finish the proof, it might be because your guess
was wrong. Change your guess and try again.)


*24. Suppose {A_i | i ∈ I} and {B_i | i ∈ I} are indexed families of sets
and I ≠ ∅.
(a) Prove that ∪_{i∈I} (A_i \ B_i) ⊆ (∪_{i∈I} A_i) \ (∩_{i∈I} B_i).
(b) Find an example for which
∪_{i∈I} (A_i \ B_i) ≠ (∪_{i∈I} A_i) \ (∩_{i∈I} B_i).

25. Suppose {A_i | i ∈ I} and {B_i | i ∈ I} are indexed families of sets.
(a) Prove that ∪_{i∈I} (A_i ∩ B_i) ⊆ (∪_{i∈I} A_i) ∩ (∪_{i∈I} B_i).
(b) Find an example for which
∪_{i∈I} (A_i ∩ B_i) ≠ (∪_{i∈I} A_i) ∩ (∪_{i∈I} B_i).

26. Prove that for all integers a and b there is an integer c such that
a | c and b | c.

27. (a) Prove that for every integer n, 15 | n iff 3 | n and 5 | n.
(b) Prove that it is not true that for every integer n, 60 | n iff 6 | n and
10 | n.


3.5 Proofs Involving Disjunctions 


Suppose one of your givens in a proof has the form P ∨ Q. This given tells
you that either P or Q is true, but it doesn’t tell you which. Thus, there are
two possibilities that you must take into account. One way to do the proof
would be to consider these two possibilities in turn. In other words, first
assume that P is true and use this assumption to prove your goal. Then
assume Q is true and give another proof that the goal is true. Although you
don’t know which of these assumptions is correct, the given P ∨ Q tells you
that one of them must be correct. Whichever one it is, you have shown that
it implies the goal. Thus, the goal must be true.

The two possibilities that are considered separately in this type of proof
(the possibility that P is true and the possibility that Q is true) are called
cases. The given P ∨ Q justifies the use of these two cases by guaranteeing
that these cases cover all of the possibilities. Mathematicians say in this
situation that the cases are exhaustive. Any proof can be broken into two or
more cases at any time, as long as the cases are exhaustive.


To use a given of the form P ∨ Q:


Break your proof into cases. For case 1, assume that P is true and use this
assumption to prove the goal. For case 2, assume Q is true and give another
proof of the goal.


Scratch work


Before using strategy:

         Givens        Goal
         P ∨ Q         ___

After using strategy:

Case 1:  Givens        Goal
         P             ___
Case 2:  Givens        Goal
         Q             ___

Form of final proof:
Case 1. P is true.
[Proof of goal goes here.]
Case 2. Q is true.
[Proof of goal goes here.]
Since we know P ∨ Q, these cases cover all the possibilities. Therefore the
goal must be true.


Example 3.5.1. Suppose that A, B, and C are sets. Prove that if A ⊆ C and
B ⊆ C then A ∪ B ⊆ C.


Scratch work 


We assume A ⊆ C and B ⊆ C and prove A ∪ B ⊆ C. Writing out the goal
using logical symbols gives us the following givens and goal:

Givens        Goal
A ⊆ C         ∀x(x ∈ A ∪ B → x ∈ C)
B ⊆ C

To prove the goal we let x be arbitrary, assume x ∈ A ∪ B, and try to
prove x ∈ C. Thus, we now have a new given x ∈ A ∪ B, which we write as
x ∈ A ∨ x ∈ B, and our goal is now x ∈ C.

Givens              Goal
A ⊆ C               x ∈ C
B ⊆ C
x ∈ A ∨ x ∈ B

Because the goal cannot be analyzed any further at this point, we look
more closely at the givens. The first given will be useful if we ever come
across an object that is an element of A, since it would allow us to conclude
immediately that this object must also be an element of C. Similarly, the
second given will be useful if we come across an element of B. Keeping in
mind that we should watch for any elements of A or B that might come up,
we move on to the third given. Because this given has the form P ∨ Q, we
try proof by cases. For the first case we assume x ∈ A, and for the second
we assume x ∈ B. In the first case we therefore have the following givens
and goal:

Givens        Goal
A ⊆ C         x ∈ C
B ⊆ C
x ∈ A

We’ve already decided that if we ever come across an element of A, we
can use the first given to conclude that it is also an element of C. Since we
now have x ∈ A as a given, we can conclude that x ∈ C, which is our goal.
The reasoning for the second case is quite similar, using the second given
instead of the first.


Solution 


Theorem. Suppose that A, B, and C are sets. If A ⊆ C and B ⊆ C then
A ∪ B ⊆ C.


Proof. Suppose A ⊆ C and B ⊆ C, and let x be an arbitrary element of
A ∪ B. Then either x ∈ A or x ∈ B.

Case 1. x ∈ A. Then since A ⊆ C, x ∈ C.

Case 2. x ∈ B. Then since B ⊆ C, x ∈ C.

Since we know that either x ∈ A or x ∈ B, these cases cover all the
possibilities, so we can conclude that x ∈ C. Since x was an arbitrary
element of A ∪ B, this means that A ∪ B ⊆ C.
□


Note that the cases in this proof are not exclusive. In other words, it is 
possible for both x € A and x € B to be true, so some values of x might fall 
under both cases. There is nothing wrong with this. The cases in a proof by 
cases must cover all possibilities, but there is no harm in covering some 
possibilities more than once. In other words, the cases must be exhaustive, 
but they need not be exclusive. 
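
The distinction between exhaustive and exclusive cases can be seen on a concrete instance of Example 3.5.1. In the sketch below, the sets A, B, and C and the function names applicable_cases and union_is_subset_by_cases are chosen only for illustration. The element 2 falls under both cases, and the conclusion still holds.

```python
def applicable_cases(x, a, b):
    """Return the case numbers that apply to x: 1 if x is in A, 2 if x is in B."""
    cases = []
    if x in a:
        cases.append(1)
    if x in b:
        cases.append(2)
    return cases


def union_is_subset_by_cases(a, b, c):
    """Assuming A <= C and B <= C, check each x in A | B the way the proof does."""
    if not (a <= c and b <= c):
        raise ValueError("the givens A <= C and B <= C do not hold")
    for x in a | b:
        if not applicable_cases(x, a, b):  # cases would not be exhaustive
            return False
        if x not in c:                     # the goal would fail for this x
            return False
    return True


A, B, C = {1, 2}, {2, 3}, {1, 2, 3, 4}
print(applicable_cases(2, A, B))
print(union_is_subset_by_cases(A, B, C))
```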

Proof by cases is sometimes also helpful if you are proving a goal of the
form P ∨ Q. If you can prove P in some cases and Q in others, then as long
as your cases are exhaustive you can conclude that P ∨ Q is true. This
method is particularly useful if one of the givens also has the form of a
disjunction, because then you can use the cases suggested by this given.


To prove a goal of the form P ∨ Q:
Break your proof into cases. In each case, either prove P or prove Q.


Example 3.5.2. Suppose that A, B, and C are sets. Prove that A \ (B \ C) ⊆
(A \ B) ∪ C.


Scratch work 


Because the goal is ∀x(x ∈ A \ (B \ C) → x ∈ (A \ B) ∪ C), we let x be
arbitrary, assume x ∈ A \ (B \ C), and try to prove x ∈ (A \ B) ∪ C. Writing
these statements out in logical symbols gives us:

Givens                          Goal
x ∈ A ∧ ¬(x ∈ B ∧ x ∉ C)        (x ∈ A ∧ x ∉ B) ∨ x ∈ C

We split the given into two separate givens, x ∈ A and ¬(x ∈ B ∧ x ∉ C),
and since the second is a negated statement we use one of De Morgan’s
laws to reexpress it as the positive statement x ∉ B ∨ x ∈ C.

Givens              Goal
x ∈ A               (x ∈ A ∧ x ∉ B) ∨ x ∈ C
x ∉ B ∨ x ∈ C

Now the second given and the goal are both disjunctions, so we’ll try
considering the two cases x ∉ B and x ∈ C suggested by the second given.
According to our strategy for proving goals of the form P ∨ Q, if in each
case we can either prove x ∈ A ∧ x ∉ B or prove x ∈ C, then the proof will
be complete. For the first case we assume x ∉ B.

Givens        Goal
x ∈ A         (x ∈ A ∧ x ∉ B) ∨ x ∈ C
x ∉ B

In this case the goal is clearly true, because in fact we can conclude that
x ∈ A ∧ x ∉ B. For the second case we assume x ∈ C, and once again the goal
is clearly true.


Solution
Theorem. Suppose that A, B, and C are sets. Then A \ (B \ C) ⊆ (A \ B) ∪ C.


Proof. Suppose x ∈ A \ (B \ C). Then x ∈ A and x ∉ B \ C. Since x ∉ B \ C,
it follows that either x ∉ B or x ∈ C. We will consider these cases
separately.

Case 1. x ∉ B. Then since x ∈ A, x ∈ A \ B, so x ∈ (A \ B) ∪ C.

Case 2. x ∈ C. Then clearly x ∈ (A \ B) ∪ C.

Since x was an arbitrary element of A \ (B \ C), we can conclude that
A \ (B \ C) ⊆ (A \ B) ∪ C.
□
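
The theorem of Example 3.5.2 asserts only an inclusion, and a small experiment suggests why. The Python sketch below (helper names are invented for this illustration) tests the inclusion for all subsets of a three-element universe and also exhibits one choice of A, B, and C for which the two sides differ.

```python
from itertools import combinations


def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(chosen)
            for size in range(len(items) + 1)
            for chosen in combinations(items, size)]


def inclusion_holds(universe):
    """Test A - (B - C) <= (A - B) | C for all subsets A, B, C of `universe`."""
    candidates = subsets(universe)
    return all(a - (b - c) <= (a - b) | c
               for a in candidates
               for b in candidates
               for c in candidates)


def sides(a, b, c):
    """Return the pair (A - (B - C), (A - B) | C) for given sets."""
    return a - (b - c), (a - b) | c


print(inclusion_holds({1, 2, 3}))
print(sides({1}, set(), {2}))
```

With A = {1}, B = ∅, and C = {2}, the left side is {1} while the right side is {1, 2}, so the inclusion cannot in general be strengthened to an equation.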
Sometimes you may find it useful to break a proof into cases even if the
cases are not suggested by a given of the form P ∨ Q. Any proof can be
broken into cases at any time, as long as the cases exhaust all of the
possibilities.


Example 3.5.3. Prove that for every integer x, the remainder when x² is
divided by 4 is either 0 or 1.


Scratch work 
We start by letting x be an arbitrary integer and then try to prove that the
remainder when x² is divided by 4 is either 0 or 1.

Givens        Goal
x ∈ ℤ         (x² ÷ 4 has remainder 0) ∨ (x² ÷ 4 has remainder 1)

Because the goal is a disjunction, breaking the proof into cases seems
like a likely approach, but there is no given that suggests what cases to use.
However, trying out a few values for x suggests the right cases:

x     x²     quotient of x² ÷ 4     remainder of x² ÷ 4
1     1      0                      1
2     4      1                      0
3     9      2                      1
4     16     4                      0
5     25     6                      1
6     36     9                      0

It appears that the remainder is 0 when x is even and 1 when x is odd.
These are the cases we will use. Thus, for case 1 we assume x is even and
try to prove that the remainder is 0, and for case 2 we assume x is odd and
prove that the remainder is 1. Because every integer is either even or odd,
these cases are exhaustive.

Filling in the definition of even, here are our givens and goal for case 1:

Givens               Goal
x ∈ ℤ                x² ÷ 4 has remainder 0
∃k ∈ ℤ(x = 2k)

We immediately use the second given and let k stand for some particular
integer for which x = 2k. Then x² = (2k)² = 4k², so clearly when we divide
x² by 4 the quotient is k² and the remainder is 0.
Case 2 is quite similar:

Givens                   Goal
x ∈ ℤ                    x² ÷ 4 has remainder 1
∃k ∈ ℤ(x = 2k + 1)

Once again we use the second given immediately and let k stand for an
integer for which x = 2k + 1. Then x² = (2k + 1)² = 4k² + 4k + 1 = 4(k² + k)
+ 1, so when x² is divided by 4 the quotient is k² + k and the remainder is 1.


Solution 


Theorem. For every integer x, the remainder when x² is divided by 4 is
either 0 or 1.


Proof. Suppose x is an integer. We consider two cases.
Case 1. x is even. Then x = 2k for some integer k, so x² = 4k². Clearly the
remainder when x² is divided by 4 is 0.

Case 2. x is odd. Then x = 2k + 1 for some integer k, so x² = 4k² + 4k + 1 =
4(k² + k) + 1. Clearly in this case the remainder when x² is divided by 4 is
1.
□
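
The quotients and remainders found in the two cases can be compared with direct division. This is a minimal sketch with made-up function names; it follows the even and odd cases of the proof and checks them against Python's divmod for a range of integers, including negative ones.

```python
def square_divided_by_4(x):
    """Return (quotient, remainder) when x*x is divided by 4."""
    return divmod(x * x, 4)


def predicted_by_cases(x):
    """Return the (quotient, remainder) that the proof by cases predicts."""
    if x % 2 == 0:
        k = x // 2             # case 1: x = 2k
        return k * k, 0
    k = (x - 1) // 2           # case 2: x = 2k + 1
    return k * k + k, 1


def cases_agree(limit=50):
    """Compare prediction and direct division for integers in [-limit, limit]."""
    return all(square_divided_by_4(x) == predicted_by_cases(x)
               for x in range(-limit, limit + 1))


print(square_divided_by_4(5), predicted_by_cases(5), cases_agree())
```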

Sometimes in a proof of a goal that has the form P ∨ Q it is hard to
figure out how to break the proof into cases. Here’s a way of doing it that is
often helpful. Simply assume that P is true in case 1 and assume that it is
false in case 2. Certainly P is either true or false, so these cases are
exhaustive. In the first case you have assumed that P is true, so certainly the
goal P ∨ Q is true. Thus, no further reasoning is needed in case 1. In the
second case you have assumed that P is false, so the only way the goal P ∨
Q could be true is if Q is true. Thus, to complete this case you should try to
prove Q.


To prove a goal of the form P ∨ Q:


If P is true, then clearly the goal P ∨ Q is true, so you only need to worry
about the case in which P is false. You can complete the proof in this case
by proving that Q is true.


Scratch work


Before using strategy:

Givens        Goal
___           P ∨ Q

After using strategy:

Givens        Goal
___           Q
¬P

Form of final proof:
If P is true, then of course P ∨ Q is true. Now suppose P is false.
[Proof of Q goes here.]
Thus, P ∨ Q is true.


Thus, this strategy for proving P ∨ Q suggests that you transform the
problem by adding ¬P as a new given and changing the goal to Q. It is
interesting to note that this is exactly the same as the transformation you
would use if you were proving the goal ¬P → Q! This is not really
surprising, because we already know that the statements P ∨ Q and ¬P →
Q are equivalent. But we derived this equivalence before from the truth
table for the conditional connective, and this truth table may have been hard
to understand at first. Perhaps the reasoning we’ve given makes this
equivalence, and therefore the truth table for the conditional connective,
seem more natural.

Of course, the roles of P and Q could be reversed in using this strategy.
Thus, you can also prove P ∨ Q by assuming that Q is false and proving P.
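
The equivalence behind this strategy involves only four truth assignments, so it can be checked exhaustively. In the sketch below the conditional connective is written out as an explicit truth table (the dictionary CONDITIONAL, a name chosen here), and both forms of the strategy are compared with the disjunction.

```python
from itertools import product

# Truth table for the conditional connective, keyed by (antecedent, consequent).
CONDITIONAL = {
    (True, True): True,
    (True, False): False,
    (False, True): True,
    (False, False): True,
}


def disjunction_matches_conditional():
    """Compare P or Q with (not P) -> Q on all four truth assignments."""
    return all((p or q) == CONDITIONAL[(not p, q)]
               for p, q in product([True, False], repeat=2))


def reversed_roles_match():
    """Compare P or Q with (not Q) -> P on all four truth assignments."""
    return all((p or q) == CONDITIONAL[(not q, p)]
               for p, q in product([True, False], repeat=2))


print(disjunction_matches_conditional(), reversed_roles_match())
```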


Example 3.5.4. Prove that for every real number x, if x² ≥ x then either
x ≤ 0 or x ≥ 1.


Scratch work 


Our goal is ∀x(x² ≥ x → (x ≤ 0 ∨ x ≥ 1)), so to get started we let x be an
arbitrary real number, assume x² ≥ x, and set x ≤ 0 ∨ x ≥ 1 as our goal:

Givens        Goal
x² ≥ x        x ≤ 0 ∨ x ≥ 1

According to our strategy, to prove this goal we can either assume x > 0
and prove x ≥ 1 or assume x < 1 and prove x ≤ 0. The assumption that x is
positive seems more likely to be useful in reasoning about inequalities, so
we take the first approach.

Givens        Goal
x² ≥ x        x ≥ 1
x > 0

The proof is now easy. Since x > 0, we can divide the given inequality x²
≥ x by x to get the goal x ≥ 1.


Solution 


Theorem. For every real number x, if x² ≥ x then either x ≤ 0 or x ≥ 1.


Proof. Suppose x² ≥ x. If x ≤ 0, then of course x ≤ 0 or x ≥ 1. Now suppose
x > 0. Then we can divide both sides of the inequality x² ≥ x by x to conclude
that x ≥ 1. Thus, either x ≤ 0 or x ≥ 1.
□
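
The theorem of Example 3.5.4 concerns all real numbers, which no program can enumerate, but it can be tested exactly on a sample of rational numbers. The sketch below uses Python's fractions.Fraction to avoid rounding error. The grid of sample points and the function names are arbitrary choices made for this illustration.

```python
from fractions import Fraction


def sample_points():
    """Rationals from -3 to 3 in steps of 1/8 (an arbitrary test grid)."""
    return [Fraction(n, 8) for n in range(-24, 25)]


def theorem_holds_at(x):
    """If x*x >= x, check that x <= 0 or x >= 1; otherwise nothing to check."""
    if x * x >= x:
        return x <= 0 or x >= 1
    return True


def positive_case_holds_at(x):
    """For x > 0 with x*x >= x, dividing both sides by x should give x >= 1."""
    if x > 0 and x * x >= x:
        return (x * x) / x >= x / x
    return True


print(all(theorem_holds_at(x) for x in sample_points()))
print(all(positive_case_holds_at(x) for x in sample_points()))
```

Note that for a sample point such as 1/2 the hypothesis x² ≥ x fails, so the conditional holds there without any further check.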
The equivalence of P ∨ Q and ¬P → Q also suggests a rule of inference
called disjunctive syllogism for using a given statement of the form P ∨ Q:


To use a given of the form P ∨ Q:

If you are also given ¬P, or you can prove that P is false, then you can
use this given to conclude that Q is true. Similarly, if you are given ¬Q or
can prove that Q is false, then you can conclude that P is true.


In fact, this rule is the one we used in our first example of deductive
reasoning in Chapter 1!

Once again, we end this section with a proof for you to read without the 
benefit of a preliminary scratch work analysis. 


Theorem 3.5.5. Suppose m and n are integers. If mn is even, then either m
is even or n is even.


Proof. Suppose mn is even. Then we can choose an integer k such that mn =
2k. If m is even then there is nothing more to prove, so suppose m is odd.
Then m = 2j + 1 for some integer j. Substituting this into the equation mn =
2k, we get (2j + 1)n = 2k, so 2jn + n = 2k, and therefore n = 2k - 2jn =
2(k - jn). Since k - jn is an integer, it follows that n is even.
□


Commentary. The overall form of the proof is the following:


Suppose mn is even.
    If m is even, then clearly either m is even or n is even. Now suppose m
    is not even. Then m is odd.
        [Proof that n is even goes here.]
    Therefore either m is even or n is even.
Therefore if mn is even then either m is even or n is even.


The assumptions that mn is even and m is odd lead, by existential
instantiation, to the equations mn = 2k and m = 2j + 1. Although the proof
doesn’t say so explicitly, you are expected to work out for yourself that in
order to prove that n is even it suffices to find an integer c such that n = 2c.
Straightforward algebra leads to the equation n = 2(k - jn), so the choice
c = k - jn works.
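
The choice c = k - jn can be tested numerically. The following sketch (the function names are hypothetical) computes k and j from m and n in the case the proof treats, namely m odd and mn even, and confirms that n = 2c.

```python
def even_witness(m, n):
    """For odd m and even m*n, return c = k - j*n, where m*n = 2k and m = 2j + 1."""
    k = (m * n) // 2   # exact, because m*n is even
    j = (m - 1) // 2   # exact, because m is odd
    return k - j * n


def witness_works(limit=30):
    """Check n == 2 * even_witness(m, n) whenever m is odd and m*n is even."""
    for m in range(-limit, limit + 1):
        for n in range(-limit, limit + 1):
            if m % 2 == 1 and (m * n) % 2 == 0:
                if n != 2 * even_witness(m, n):
                    return False
    return True


print(even_witness(3, 4), witness_works())
```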


Exercises
PD *1. Suppose A, B, and C are sets. Prove that A ∩ (B ∪ C) ⊆ (A ∩ B)
∪ C.
PD 2. Suppose A, B, and C are sets. Prove that (A ∪ B) \ C ⊆ A ∪ (B \
C).
PD 3. Suppose A and B are sets. Prove that A \ (A \ B) = A ∩ B.
PD 4. Suppose A, B, and C are sets. Prove that A \ (B \ C) = (A \ B) ∪
(A ∩ C).
PD *5. Suppose A ∩ C ⊆ B ∩ C and A ∪ C ⊆ B ∪ C. Prove that A ⊆ B.


PD 6. Recall from Section 1.4 that the symmetric difference of two sets
A and B is the set A △ B = (A \ B) ∪ (B \ A) = (A ∪ B) \ (A ∩ B).
Prove that if A △ B ⊆ A then B ⊆ A.

PD 7. Suppose A, B, and C are sets. Prove that A ∪ C ⊆ B ∪ C iff A \ C
⊆ B \ C.

PD *8. Prove that for any sets A and B, 𝒫(A) ∪ 𝒫(B) ⊆ 𝒫(A ∪ B).

PD 9. Prove that for any sets A and B, if 𝒫(A) ∪ 𝒫(B) = 𝒫(A ∪ B)
then either A ⊆ B or B ⊆ A.

10. Suppose x and y are real numbers and x ≠ 0. Prove that y + 1/x =
1 + y/x iff either x = 1 or y = 1.

11. Prove that for every real number x, if |x - 3| > 3 then x² > 6x.
(Hint: According to the definition of |x - 3|, if x - 3 ≥ 0 then |x -
3| = x - 3, and if x - 3 < 0 then |x - 3| = 3 - x. The easiest way to
use this fact is to break your proof into cases. Assume that x - 3 ≥
0 in case 1, and x - 3 < 0 in case 2.)

*12. Prove that for every real number x, |2x - 6| > x iff |x - 4| > 2.
(Hint: Read the hint for exercise 11.)

13. (a) Prove that for all real numbers a and b, |a| ≤ b iff -b ≤ a ≤ b.
(b) Prove that for any real number x, -|x| ≤ x ≤ |x|. (Hint: Use part
(a).)
(c) Prove that for all real numbers x and y, |x + y| ≤ |x| + |y|. (This is
called the triangle inequality. One way to prove this is to
combine parts (a) and (b), but you can also do it by considering a
number of cases.)
(d) Prove that for all real numbers x and y, |x + y| ≥ |x| - |y|. (Hint:
Start with the equation |x| = |(x + y) + (-y)| and then apply the
triangle inequality to the right-hand side.)

14. Prove that for every integer x, x² + x is even.

15. Prove that for every integer x, the remainder when x⁴ is divided
by 8 is either 0 or 1.

*16. Suppose ℱ and 𝒢 are nonempty families of sets.
PD (a) Prove that ∪(ℱ ∪ 𝒢) = (∪ℱ) ∪ (∪𝒢).
(b) Prove that B ∪ (∪ℱ) = ∪_{A∈ℱ} (B ∪ A).
(c) Can you discover and prove a similar theorem about ∩(ℱ ∪ 𝒢)?

17. Suppose ℱ is a nonempty family of sets and B is a set.
PD (a) Prove that B ∪ (∪ℱ) = ∪(ℱ ∪ {B}).
(b) Prove that B ∪ (∩ℱ) = ∩_{A∈ℱ} (B ∪ A).
(c) Can you discover and prove similar theorems about B ∩ (∪ℱ)
and B ∩ (∩ℱ)?

18. Suppose ℱ, 𝒢, and ℋ are nonempty families of sets and for every
A ∈ ℱ and every B ∈ 𝒢, A ∪ B ∈ ℋ. Prove that ∩ℋ ⊆ (∩ℱ) ∪ (∩𝒢).

PD 19. Suppose A and B are sets. Prove that ∀x(x ∈ A △ B ↔ (x ∈ A ↔
x ∉ B)).

PD *20. Suppose A, B, and C are sets. Prove that A △ B and C are disjoint
iff A ∩ C = B ∩ C.

PD 21. Suppose A, B, and C are sets. Prove that A △ B ⊆ C iff A ∪ C =
B ∪ C.

PD 22. Suppose A, B, and C are sets. Prove that C ⊆ A △ B iff C ⊆ A ∪
B and A ∩ B ∩ C = ∅.

PD *23. Suppose A, B, and C are sets.
(a) Prove that A \ C ⊆ (A \ B) ∪ (B \ C).
(b) Prove that A △ C ⊆ (A △ B) ∪ (B △ C).

PD *24. Suppose A, B, and C are sets.
(a) Prove that (A ∪ B) △ C ⊆ (A △ C) ∪ (B △ C).
(b) Find an example of sets A, B, and C such that (A ∪ B) △ C ≠
(A △ C) ∪ (B △ C).


PD 25. Suppose A, B, and C are sets.
(a) Prove that (A △ C) ∩ (B △ C) ⊆ (A ∩ B) △ C.
(b) Is it always true that (A ∩ B) △ C ⊆ (A △ C) ∩ (B △ C)? Give
either a proof or a counterexample.

PD 26. Suppose A, B, and C are sets. Consider the sets (A \ B) △ C and
(A △ C) \ (B △ C). Can you prove that either is a subset of the other?
Justify your conclusions with either proofs or counterexamples.

*27. Consider the following putative theorem.

Theorem? For every real number x, if |x - 3| < 3 then 0 < x < 6.

Is the following proof correct? If so, what proof strategies does it
use? If not, can it be fixed? Is the theorem correct?

Proof. Let x be an arbitrary real number, and suppose |x - 3| < 3. We
consider two cases:
Case 1. x - 3 ≥ 0. Then |x - 3| = x - 3. Plugging this into the
assumption that |x - 3| < 3, we get x - 3 < 3, so clearly x < 6.
Case 2. x - 3 < 0. Then |x - 3| = 3 - x, so the assumption |x - 3| < 3
means that 3 - x < 3. Therefore 3 < 3 + x, so 0 < x.
Since we have proven both 0 < x and x < 6, we can conclude that 0
< x < 6.
□
28. Consider the following putative theorem.

Theorem? For any sets A, B, and C, if A \ B ⊆ C and A ⊈ C then A ∩ B
≠ ∅.

Is the following proof correct? If so, what proof strategies does it
use? If not, can it be fixed? Is the theorem correct?

Proof. Suppose A \ B ⊆ C and A ⊈ C. Since A ⊈ C, we can choose some x
such that x ∈ A and x ∉ C. Since x ∉ C and A \ B ⊆ C, x ∉ A \ B.
Therefore either x ∉ A or x ∈ B. But we already know that x ∈ A, so
it follows that x ∈ B. Since x ∈ A and x ∈ B, x ∈ A ∩ B. Therefore A ∩
B ≠ ∅.
□

*29. Consider the following putative theorem.

Theorem? ∀x ∈ ℝ∃y ∈ ℝ(xy² ≠ y - x).

Is the following proof correct? If so, what proof strategies does it
use? If not, can it be fixed? Is the theorem correct?

Proof. Let x be an arbitrary real number.
Case 1. x = 0. Let y = 1. Then xy² = 0 and y - x = 1 - 0 = 1, so xy²
≠ y - x.
Case 2. x ≠ 0. Let y = 0. Then xy² = 0 and y - x = -x ≠ 0, so xy² ≠ y
- x.
Since these cases are exhaustive, we have shown that ∃y ∈ ℝ(xy² ≠
y - x). Since x was arbitrary, this shows that ∀x ∈ ℝ∃y ∈ ℝ(xy² ≠ y -
x).
□
30. Prove that if ∀xP(x) → ∃xQ(x) then ∃x(P(x) → Q(x)). (Hint:
Remember that P → Q is equivalent to ¬P ∨ Q.)

*31. Consider the following putative theorem.

Theorem? Suppose A, B, and C are sets and A ⊆ B ∪ C. Then either
A ⊆ B or A ⊆ C.

Is the following proof correct? If so, what proof strategies does it
use? If not, can it be fixed? Is the theorem correct?

Proof. Let x be an arbitrary element of A. Since A ⊆ B ∪ C, it follows
that either x ∈ B or x ∈ C.
Case 1. x ∈ B. Since x was an arbitrary element of A, it follows that
∀x ∈ A(x ∈ B), which means that A ⊆ B.
Case 2. x ∈ C. Similarly, since x was an arbitrary element of A, we
can conclude that A ⊆ C.
Thus, either A ⊆ B or A ⊆ C.
□

PD 32. Suppose A, B, and C are sets and A ⊆ B ∪ C. Prove that either A
⊆ B or A ∩ C ≠ ∅.

33. Prove ∃x(P(x) → ∀yP(y)). (Note: Assume the universe of
discourse is not the empty set.)


3.6 Existence and Uniqueness Proofs 


In this section we consider proofs in which the goal has the form ∃!xP(x).
Recall that this formula means “there is exactly one x such that P(x),” and
as we saw in Section 2.2, it can be thought of as an abbreviation for the
formula ∃x(P(x) ∧ ¬∃y(P(y) ∧ y ≠ x)). According to the proof strategies
discussed in previous sections, we could therefore prove this goal by
finding a particular value of x for which we could prove both P(x) and
¬∃y(P(y) ∧ y ≠ x). The last part of this proof would involve proving a negated
statement, but we can reexpress it as an equivalent positive statement:


¬∃y(P(y) ∧ y ≠ x)
is equivalent to ∀y¬(P(y) ∧ y ≠ x) (quantifier negation law),
which is equivalent to ∀y(¬P(y) ∨ y = x) (De Morgan’s law),
which is equivalent to ∀y(P(y) → y = x) (conditional law).


Thus, we see that ∃!xP(x) could also be written as ∃x(P(x) ∧ ∀y(P(y) →
y = x)). In fact, as the next example shows, several other formulas are also
equivalent to ∃!xP(x), and they suggest other approaches to proving goals
of this form.


Example 3.6.1. Prove that the following formulas are all equivalent:


1. ∃x(P(x) ∧ ∀y(P(y) → y = x)).
2. ∃x∀y(P(y) ↔ y = x).
3. ∃xP(x) ∧ ∀y∀z((P(y) ∧ P(z)) → y = z).


Scratch work 


If we prove directly that each of these statements is equivalent to each of
the others, then we will have three biconditionals to prove: statement 1 iff
statement 2, statement 1 iff statement 3, and statement 2 iff statement 3. If
we prove each biconditional by the methods of Section 3.4, then each will
involve two conditional proofs, so we will need a total of six conditional
proofs. Fortunately, there is an easier way. We will prove that statement 1
implies statement 2, statement 2 implies statement 3, and statement 3
implies statement 1: just three conditionals. Although we will not give a
separate proof that statement 2 implies statement 1, it will follow from the
fact that statement 2 implies statement 3 and statement 3 implies statement
1. Similarly, the other two conditionals follow from the three we will prove.
Mathematicians almost always use some such shortcut when proving that
several statements are all equivalent. Because we’ll be proving three
conditional statements, our proof will have three parts, which we will label
1 → 2, 2 → 3, and 3 → 1. We’ll need to work out our strategy for the three
parts separately.

1 → 2. We assume statement 1 and prove statement 2. Because statement
1 starts with an existential quantifier, we choose a name, say x₀, for some
object for which both P(x₀) and ∀y(P(y) → y = x₀) are true. Thus, we now
have the following situation:

Givens                   Goal
P(x₀)                    ∃x∀y(P(y) ↔ y = x)
∀y(P(y) → y = x₀)

Our goal also starts with an existential quantifier, so to prove it we should
try to find a value of x that makes the rest of the statement come out true.
Of course, the obvious choice is x = x₀. Plugging in x₀ for x, we see that we
must now prove ∀y(P(y) ↔ y = x₀). We let y be arbitrary and prove both
directions of the biconditional. The → direction is clear by the second
given. For the ← direction, suppose y = x₀. We also have P(x₀) as a given,
and plugging in y for x₀ in this given we get P(y).

2 → 3. Statement 2 is an existential statement, so we let x₀ be some
object such that ∀y(P(y) ↔ y = x₀). The goal, statement 3, is a conjunction,
so we treat it as two separate goals.

Givens                   Goals
∀y(P(y) ↔ y = x₀)        ∃xP(x)
                         ∀y∀z((P(y) ∧ P(z)) → y = z)

To prove the first goal we must choose a value for x, and of course the
obvious value is x = x₀ again. Thus, we must prove P(x₀). The natural way
to use our only given is to plug in something for y; and to prove the goal
P(x₀), the obvious thing to plug in is x₀. This gives us P(x₀) ↔ x₀ = x₀. Of
course, x₀ = x₀ is true, so by the ← direction of the biconditional, we get
P(x₀).

For the second goal, we let y and z be arbitrary, assume P(y) and P(z),
and try to prove y = z.

Givens                   Goal
∀y(P(y) ↔ y = x₀)        y = z
P(y)
P(z)

Plugging in each of y and z in the first given we get P(y) ↔ y = x₀ and
P(z) ↔ z = x₀. Since we’ve assumed P(y) and P(z), this time we use the →
directions of these biconditionals to conclude that y = x₀ and z = x₀. Our
goal y = z clearly follows.

3 → 1. Because statement 3 is a conjunction, we treat it as two separate
givens. The first is an existential statement, so we let x₀ stand for some
object such that P(x₀) is true. To prove statement 1 we again let x = x₀, so
we have this situation:

Givens                             Goal
P(x₀)                              P(x₀) ∧ ∀y(P(y) → y = x₀)
∀y∀z((P(y) ∧ P(z)) → y = z)

We already know the first half of the goal, so we only need to prove the
second. For this we let y be arbitrary, assume P(y), and make y = x₀ our
goal.

Givens                             Goal
P(x₀)                              y = x₀
∀y∀z((P(y) ∧ P(z)) → y = z)
P(y)

But now we know both P(y) and P(x₀), so the goal y = x₀ follows from the
second given.


Solution
Theorem. The following are equivalent:
1. ∃x(P(x) ∧ ∀y(P(y) → y = x)).
2. ∃x∀y(P(y) ↔ y = x).
3. ∃xP(x) ∧ ∀y∀z((P(y) ∧ P(z)) → y = z).

Proof. 1 → 2. By statement 1, we can let x₀ be some object such that P(x₀)
and ∀y(P(y) → y = x₀). To prove statement 2 we will show that
∀y(P(y) ↔ y = x₀). Let y be arbitrary. We already know the → direction of
the biconditional. For the ← direction, suppose y = x₀. Then since we know
P(x₀), we can conclude P(y).

2 → 3. By statement 2, choose x₀ such that ∀y(P(y) ↔ y = x₀). Then, in
particular, P(x₀) ↔ x₀ = x₀, and since clearly x₀ = x₀, it follows that P(x₀) is
true. Thus, ∃xP(x). To prove the second half of statement 3, let y and z be
arbitrary and suppose P(y) and P(z). Then by our choice of x₀ (as something
for which ∀y(P(y) ↔ y = x₀) is true), it follows that y = x₀ and z = x₀, so
y = z.

3 → 1. By the first half of statement 3, let x₀ be some object such that
P(x₀). Statement 1 will follow if we can show that ∀y(P(y) → y = x₀), so
suppose P(y). Since we now have both P(x₀) and P(y), by the second half of
statement 3 we can conclude that y = x₀, as required.
□
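Readers who like to experiment may want to test the theorem on small examples. The following Python sketch is only a sanity check, not a proof: it tries every possible predicate P on a small finite domain and confirms that the three statements always receive the same truth value. The function names (stmt1, stmt2, stmt3, all_agree) and the representation of P as a dictionary are illustrative choices, not notation from the text.

```python
from itertools import product

def stmt1(P, domain):
    # Statement 1: there is an x with P(x) such that every y with P(y) equals x.
    return any(P[x] and all((not P[y]) or y == x for y in domain) for x in domain)

def stmt2(P, domain):
    # Statement 2: there is an x such that, for every y, P(y) holds iff y = x.
    return any(all(P[y] == (y == x) for y in domain) for x in domain)

def stmt3(P, domain):
    # Statement 3: existence, together with "any two witnesses are equal".
    exists = any(P[x] for x in domain)
    unique = all((not (P[y] and P[z])) or y == z for y in domain for z in domain)
    return exists and unique

def all_agree(size):
    # Try every predicate P on the domain {0, ..., size - 1}.
    domain = range(size)
    for values in product([False, True], repeat=size):
        P = dict(zip(domain, values))
        if len({stmt1(P, domain), stmt2(P, domain), stmt3(P, domain)}) != 1:
            return False
    return True
```

Calling all_agree(4) examines all 16 predicates on a four-element domain. A finite search like this can expose an error in a conjecture, but only the proof above establishes the equivalence in general.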


Because all three of the statements in the theorem are equivalent to
∃!xP(x), we can prove a goal of this form by proving any of the three
statements in the theorem. Probably the most common technique for
proving a goal of the form ∃!xP(x) is to prove statement 3 of the theorem.


To prove a goal of the form ∃!xP(x):

Prove ∃xP(x) and ∀y∀z((P(y) ∧ P(z)) → y = z). The first of these goals
shows that there exists an x such that P(x) is true, and the second shows that
it is unique. The two parts of the proof are therefore sometimes labeled
existence and uniqueness. Each part is proven using strategies discussed
earlier.


Form of final proof:


Existence: [Proof of ∃xP(x) goes here.]
Uniqueness: [Proof of ∀y∀z((P(y) ∧ P(z)) → y = z) goes here.]


Example 3.6.2. Prove that there is a unique set A such that for every set B,
A ∪ B = B.


Scratch work 


Our goal is ∃!A P(A), where P(A) is the statement ∀B(A ∪ B = B). According
to our strategy, we can prove this by proving existence and uniqueness
separately. For the existence half of the proof we must prove ∃A P(A), so we
try to find a value of A that makes P(A) true. There is no formula for finding
this set A, but if you think about what the statement P(A) means, you should
realize that the right choice is A = ∅. Plugging this value in for A, we see
that to complete the existence half of the proof we must show that
∀B(∅ ∪ B = B). This is clearly true. (If you’re not sure of this, work out the proof!)
For the uniqueness half of the proof we prove ∀C∀D((P(C) ∧ P(D)) →
C = D). To do this, we let C and D be arbitrary, assume P(C) and P(D), and
prove C = D. Writing out what the statements P(C) and P(D) mean, we
have the following givens and goal:

Givens                        Goal
∀B(C ∪ B = B)                 C = D
∀B(D ∪ B = B)

To use the givens, we should try to find something to plug in for B in
each of them. There is a clever choice that makes the rest of the proof easy:
we plug in D for B in the first given, and C for B in the second. This gives
us C ∪ D = D and D ∪ C = C. But clearly C ∪ D = D ∪ C. (If you don’t see
why, prove it!) The goal C = D follows immediately.


Solution
Theorem. There is a unique set A such that for every set B, A ∪ B = B.
Proof. Existence: Clearly ∀B(∅ ∪ B = B), so ∅ has the required property.

Uniqueness: Suppose ∀B(C ∪ B = B) and ∀B(D ∪ B = B). Applying the
first of these assumptions to D we see that C ∪ D = D, and applying the
second to C we get D ∪ C = C. But clearly C ∪ D = D ∪ C, so C = D.
□
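The theorem speaks of all sets, which no program can enumerate, but a restricted version can be checked by brute force. The sketch below assumes a small universe and considers only its subsets; the helper names powerset and absorbing_candidates are hypothetical. It lists every subset A for which A ∪ B = B holds for all subsets B.

```python
from itertools import combinations

def powerset(universe):
    # All subsets of the universe, as frozensets so they can be compared and stored.
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def absorbing_candidates(universe):
    # Subsets A such that A | B == B for every subset B of the universe.
    subsets = powerset(universe)
    return [A for A in subsets if all(A | B == B for B in subsets)]
```

For the universe {1, 2, 3} the search finds exactly one candidate, the empty set, which matches the existence and uniqueness halves of the proof in this restricted setting.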


Sometimes a statement of the form ∃!xP(x) is proven by proving
statement 1 from Example 3.6.1. This leads to the following proof strategy.


To prove a goal of the form ∃!xP(x):


Prove ∃x(P(x) ∧ ∀y(P(y) → y = x)), using strategies from previous
sections.


Example 3.6.3. Prove that for every real number x, if x ≠ 2 then there is a
unique real number y such that 2y/(y + 1) = x.


Scratch work 


Our goal is ∀x(x ≠ 2 → ∃!y(2y/(y + 1) = x)). We therefore let x be arbitrary,
assume x ≠ 2, and prove ∃!y(2y/(y + 1) = x). According to the preceding
strategy, we can prove this goal by proving the equivalent statement

∃y(2y/(y + 1) = x ∧ ∀z(2z/(z + 1) = x → z = y)).

We start by trying to find a value of y that will make the equation 2y/(y + 1)
= x come out true. In other words, we solve this equation for y:

2y/(y + 1) = x  ⇒  2y = x(y + 1)  ⇒  y(2 - x) = x  ⇒  y = x/(2 - x).

Note that we have x ≠ 2 as a given, so the division by 2 - x in the last step
makes sense. Of course, these steps will not appear in the proof. We simply
let y = x/(2 - x) and try to prove both 2y/(y + 1) = x and
∀z(2z/(z + 1) = x → z = y).


Givens                        Goals
x ≠ 2                         2y/(y + 1) = x
y = x/(2 - x)                 ∀z(2z/(z + 1) = x → z = y)

The first goal is easy to verify by simply plugging in x/(2 - x) for y. For
the second, we let z be arbitrary, assume 2z/(z + 1) = x, and prove z = y:

Givens                        Goal
x ≠ 2                         z = y
y = x/(2 - x)
2z/(z + 1) = x

We can show that z = y now by solving for z in the third given:

2z/(z + 1) = x  ⇒  2z = x(z + 1)  ⇒  z(2 - x) = x  ⇒  z = x/(2 - x) = y.


Note that the steps we used here are exactly the same as the steps we 
used earlier in solving for y. This is a common pattern in existence and 
uniqueness proofs. Although the scratch work for figuring out an existence 
proof should not appear in the proof, this scratch work, or reasoning similar 
to it, can sometimes be used to prove that the object shown to exist is 
unique. 


Solution 


Theorem. For every real number x, if x ≠ 2 then there is a unique real
number y such that 2y/(y + 1) = x.


Proof. Let x be an arbitrary real number, and suppose x ≠ 2. Let y = x/(2 - x),
which is defined since x ≠ 2. Then

2y/(y + 1) = (2x/(2 - x)) / (x/(2 - x) + 1) = (2x/(2 - x)) / (2/(2 - x)) = 2x/2 = x.

To see that this solution is unique, suppose 2z/(z + 1) = x. Then 2z = x(z +
1), so z(2 - x) = x. Since x ≠ 2 we can divide both sides by 2 - x to get z =
x/(2 - x) = y.
□
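The existence half of this proof is a calculation, so it is easy to spot-check with exact rational arithmetic. The following sketch uses Python's fractions module; the function names solve_for_y and satisfies_equation are illustrative. It checks only that the proposed value y = x/(2 - x) satisfies the equation for sample rational values of x; it says nothing about uniqueness, which the proof handles by algebra.

```python
from fractions import Fraction

def solve_for_y(x):
    # The value of y chosen in the proof; it is defined only when x != 2.
    x = Fraction(x)
    if x == 2:
        raise ValueError("no value of y is chosen when x = 2")
    return x / (2 - x)

def satisfies_equation(x):
    # Check 2y/(y + 1) = x exactly. Here y + 1 = 2/(2 - x), which is never 0.
    y = solve_for_y(x)
    return 2 * y / (y + 1) == Fraction(x)
```

For example, solve_for_y(3) gives -3, and indeed 2(-3)/(-3 + 1) = 3.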


The theorem in Example 3.6.1 can also be used to formulate strategies
for using givens of the form ∃!xP(x). Once again, statement 3 of the
theorem is the one used most often.


To use a given of the form ∃!xP(x):


Treat this as two given statements, ∃xP(x) and ∀y∀z((P(y) ∧ P(z)) → y =
z). To use the first statement you should probably choose a name, say x₀, to
stand for some object such that P(x₀) is true. The second tells you that if
you ever come across two objects y and z such that P(y) and P(z) are both
true, you can conclude that y = z.


Example 3.6.4. Suppose A, B, and C are sets, A and B are not disjoint, A 
and C are not disjoint, and A has exactly one element. Prove that B and C 
are not disjoint. 


Scratch work 


Givens                        Goal
A ∩ B ≠ ∅                     B ∩ C ≠ ∅
A ∩ C ≠ ∅
∃!x(x ∈ A)


We treat the last given as two separate givens, as suggested by our
strategy. Writing out the meanings of the other givens and the goal, we have
the following situation:


Givens                              Goal
∃x(x ∈ A ∧ x ∈ B)                   ∃x(x ∈ B ∧ x ∈ C)
∃x(x ∈ A ∧ x ∈ C)
∃x(x ∈ A)
∀y∀z((y ∈ A ∧ z ∈ A) → y = z)


To prove the goal, we must find something that is an element of both B
and C. To do this, we turn to the givens. The first given tells us that we can
choose a name, say b, for something such that b ∈ A and b ∈ B. Similarly,
by the second given we can let c be something such that c ∈ A and c ∈ C. At
this point the third given is redundant. We already know that there’s
something in A, because in fact we already know that b ∈ A and c ∈ A. We
may as well skip to the last given, which says that if we ever come across
two objects that are elements of A, we can conclude that they are equal. But
as we have just observed, we know that b ∈ A and c ∈ A! We can therefore
conclude that b = c. Since b ∈ B and b = c ∈ C, we have found something
that is an element of both B and C, as required to prove the goal.


Solution 


Theorem. Suppose A, B, and C are sets, A and B are not disjoint, A and C 
are not disjoint, and A has exactly one element. Then B and C are not 
disjoint. 

Proof. Since A and B are not disjoint, we can let b be something such that b
∈ A and b ∈ B. Similarly, since A and C are not disjoint, there is some
object c such that c ∈ A and c ∈ C. Since A has only one element, we must
have b = c. Thus b = c ∈ B ∩ C and therefore B and C are not disjoint.
□


Exercises 
*1. Prove that for every real number x there is a unique real number y
such that x²y = x - y.


2. Prove that there is a unique real number x such that for every real
number y, xy + x - 4 = 4y.


3. Prove that for every real number x, if x ≠ 0 and x ≠ 1 then there is
a unique real number y such that y/x = y - x.
*4. Prove that for every real number x, if x ≠ 0 then there is a unique
real number y such that for every real number z, zy = z/x.


5. Recall that if ℱ is a family of sets, then ∪ℱ = {x | ∃A(A ∈ ℱ ∧ x
∈ A)}. Suppose we define a new set ∪!ℱ by the formula ∪!ℱ =
{x | ∃!A(A ∈ ℱ ∧ x ∈ A)}.
(a) Prove that for any family of sets ℱ, ∪!ℱ ⊆ ∪ℱ.
(b) A family of sets ℱ is said to be pairwise disjoint if every pair of
distinct elements of ℱ are disjoint; that is, ∀A ∈ ℱ ∀B ∈ ℱ(A ≠
B → A ∩ B = ∅). Prove that for any family of sets ℱ, ∪!ℱ = ∪ℱ
iff ℱ is pairwise disjoint.
PD*6. Let U be any set.
(a) Prove that there is a unique A ∈ 𝒫(U) such that for every B ∈
𝒫(U), A ∪ B = B.
(b) Prove that there is a unique A ∈ 𝒫(U) such that for every B ∈
𝒫(U), A ∪ B = A.


PD*7. Let U be any set.
(a) Prove that there is a unique A ∈ 𝒫(U) such that for every B ∈
𝒫(U), A ∩ B = B.
(b) Prove that there is a unique A ∈ 𝒫(U) such that for every B ∈
𝒫(U), A ∩ B = A.


PD*8. Let U be any set.
(a) Prove that for every A ∈ 𝒫(U) there is a unique B ∈ 𝒫(U) such
that for every C ∈ 𝒫(U), C \ A = C ∩ B.
(b) Prove that for every A ∈ 𝒫(U) there is a unique B ∈ 𝒫(U) such
that for every C ∈ 𝒫(U), C ∩ A = C \ B.


PD 9. Recall that you showed in exercise 14 of Section 1.4 that
symmetric difference is associative; in other words, for all sets A,
B, and C, A △ (B △ C) = (A △ B) △ C. You may also find it useful
in this problem to note that symmetric difference is clearly
commutative; in other words, for all sets A and B, A △ B = B △ A.
(a) Prove that there is a unique identity element for symmetric
difference. In other words, there is a unique set X such that for every
set A, A △ X = A.
(b) Prove that every set has a unique inverse for the operation of
symmetric difference. In other words, for every set A there is a
unique set B such that A △ B = X, where X is the identity element
from part (a).
(c) Prove that for any sets A and B there is a unique set C such that
A △ C = B.
(d) Prove that for every set A there is a unique set B ⊆ A such that
for every set C ⊆ A, B △ C = A \ C.


PD 10. Suppose A is a set, and for every family of sets ℱ, if ∪ℱ = A then
A ∈ ℱ. Prove that A has exactly one element.


PD*11. Suppose ℱ is a family of sets that has the property that for every
𝒢 ⊆ ℱ, ∪𝒢 ∈ ℱ. Prove that there is a unique set A such that A ∈
ℱ and ∀B ∈ ℱ(B ⊆ A).


12. (a) Suppose P(x) is a statement with a free variable x. Find a
formula, using the logical symbols we have studied, that
means “there are exactly two values of x for which P(x) is
true.”
(b) Based on your answer to part (a), design a proof strategy for
proving a statement of the form “there are exactly two values of
x for which P(x) is true.”
(c) Prove that there are exactly two solutions to the equation x³ = x².

13. (a) Prove that there is a unique real number c such that there is
a unique real number x such that x² + 3x + c = 0. (In other
words, there is a unique real number c such that the equation
x² + 3x + c = 0 has exactly one solution.)
(b) Show that it is not the case that there is a unique real number x
such that there is a unique real number c such that x² + 3x + c = 0.
(Hint: You should be able to prove that for every real number x
there is a unique real number c such that x² + 3x + c = 0.)


3.7 More Examples of Proofs 


So far, most of our proofs have involved fairly straightforward applications 
of the proof techniques we’ve discussed. We end this chapter with a few 
examples of somewhat more difficult proofs. These proofs use the 
techniques of this chapter, but for various reasons they’re a little harder than 
most of our earlier proofs. Some are simply longer, involving the 
application of more proof strategies. Some require clever choices of which 
strategies to use. In some cases it’s clear what strategy to use, but some 
insight is required to see exactly how to use it. Our earlier examples, which 
were intended only to illustrate and clarify the proof techniques, may have 
made proof writing seem mechanical and dull. We hope that by studying 
these more difficult examples you will begin to see that mathematical 
reasoning can also be surprising and beautiful. 

Some proof techniques are particularly difficult to apply. For example,
when you’re proving a goal of the form ∃xP(x), the obvious way to proceed
is to try to find a value of x that makes the statement P(x) true, but
sometimes it will not be obvious how to find that value of x. Using a given
of the form ∀xP(x) is similar. You’ll probably want to plug in a particular
value for x, but to complete the proof you may have to make a clever
choice of what to plug in. Proofs that must be broken down into cases are
also sometimes difficult to figure out. It is sometimes hard to know when to
use cases and what cases to use.

We begin by looking again at the proofs from the introduction. Some 
aspects of these proofs probably seemed somewhat mysterious when you 
read them in the introduction. See if they make more sense to you now that 
you have a better understanding of how proofs are constructed. We will 
present each proof exactly as it appeared in the introduction and then follow 
it with a commentary discussing the proof techniques used. 


Theorem 3.7.1. Suppose n is an integer larger than 1 and n is not prime.
Then 2^n - 1 is not prime.


Proof. Since n is not prime, there are positive integers a and b such that a <
n, b < n, and n = ab. Let x = 2^b - 1 and y = 1 + 2^b + 2^(2b) + ··· + 2^((a-1)b).
Then

xy = (2^b - 1) · (1 + 2^b + 2^(2b) + ··· + 2^((a-1)b))
   = 2^b · (1 + 2^b + 2^(2b) + ··· + 2^((a-1)b)) - (1 + 2^b + 2^(2b) + ··· + 2^((a-1)b))
   = (2^b + 2^(2b) + 2^(3b) + ··· + 2^(ab)) - (1 + 2^b + 2^(2b) + ··· + 2^((a-1)b))
   = 2^(ab) - 1
   = 2^n - 1.

Since b < n, we can conclude that x = 2^b - 1 < 2^n - 1. Also, since ab = n
> a, it follows that b > 1. Therefore, x = 2^b - 1 > 2^1 - 1 = 1, so y < xy = 2^n
- 1. Thus, we have shown that 2^n - 1 can be written as the product of two
positive integers x and y, both of which are smaller than 2^n - 1, so 2^n - 1 is
not prime.
□


Commentary. We are given that n is not prime, and we must prove that 2^n -
1 is not prime. Both of these are negative statements, but fortunately it is
easy to reexpress them as positive statements. To say that an integer larger
than 1 is not prime means that it can be written as a product of two smaller
positive integers. Thus, the hypothesis that n is not prime means
∃a ∈ Z⁺ ∃b ∈ Z⁺(ab = n ∧ a < n ∧ b < n), and what we must prove is that
2^n - 1 is not prime, which means
∃x ∈ Z⁺ ∃y ∈ Z⁺(xy = 2^n - 1 ∧ x < 2^n - 1 ∧ y < 2^n - 1).
In the second sentence of the proof we apply existential instantiation to the
hypothesis that n is not prime, and the rest of the proof is devoted to
exhibiting numbers x and y with the properties required to prove that 2^n - 1
is not prime.

As usual in proofs of existential statements, the proof doesn’t explain
how the values of x and y were chosen, it simply demonstrates that these
values work. After the values of x and y have been given, the goal
remaining to be proven is xy = 2^n - 1 ∧ x < 2^n - 1 ∧ y < 2^n - 1. Of course,
this is treated as three separate goals, which are proven one at a time. The
proofs of these three goals involve only elementary algebra.

One of the attractive features of this proof is the calculation used to show
that xy = 2^n - 1. The formulas for x and y are somewhat complicated, and at
first their product looks even more complicated. It is a pleasant surprise
when most of the terms in this product cancel and, as if by magic, the
answer 2^n - 1 appears. Of course, we can see with hindsight that it was this
calculation that motivated the choice of x and y. There is, however, one
aspect of this calculation that may bother you. The use of “···” in the
formulas indicates that the proof depends on a pattern in the calculation that
is not being spelled out. We’ll give a more rigorous proof that xy = 2^n - 1 in
Chapter 6, after we have introduced the method of proof by mathematical
induction (see Theorem 6.5.2).
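Until that more rigorous proof is available, the cancellation can at least be checked numerically. The sketch below computes x and y exactly as they are defined in the proof and returns them together with n = ab; the function name mersenne_factors is a hypothetical label for this illustration. It assumes a and b are both at least 2, which follows from a < n, b < n, and n = ab.

```python
def mersenne_factors(a, b):
    # Given n = a * b with a, b >= 2, return n and the factors x, y from the proof.
    n = a * b
    x = 2**b - 1
    # y = 1 + 2^b + 2^(2b) + ... + 2^((a-1)b)
    y = sum(2**(k * b) for k in range(a))
    return n, x, y
```

For instance, mersenne_factors(2, 3) returns n = 6, x = 7, and y = 9, and indeed 7 · 9 = 63 = 2^6 - 1.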


Theorem 3.7.2. There are infinitely many prime numbers. 


Proof. Suppose there are only finitely many prime numbers. Let p₁, p₂, ...,
pₙ be a list of all prime numbers. Let m = p₁p₂ ··· pₙ + 1. Note that m is not
divisible by p₁, since dividing m by p₁ gives a quotient of p₂p₃ ··· pₙ and a
remainder of 1. Similarly, m is not divisible by any of p₂, p₃, ..., pₙ.

We now use the fact that every integer larger than 1 is either prime or can
be written as a product of primes. (We’ll see a proof of this fact in Chapter
6; see Theorem 6.4.2.) Clearly m is larger than 1, so m is either prime or a
product of primes. Suppose first that m is prime. Note that m is larger than
all of the numbers in the list p₁, p₂, ..., pₙ, so we’ve found a prime number
not in this list. But this contradicts our assumption that this was a list of all
prime numbers.

Now suppose m is a product of primes. Let q be one of the primes in this
product. Then m is divisible by q. But we’ve already seen that m is not
divisible by any of the numbers in the list p₁, p₂, ..., pₙ, so once again we
have a contradiction with the assumption that this list included all prime
numbers.
Since the assumption that there are finitely many prime numbers has led
to a contradiction, there must be infinitely many prime numbers.
□


Commentary. Because infinite means not finite, the statement of the
theorem might be considered to be a negative statement. It is therefore not
surprising that the proof proceeds by contradiction. The assumption that
there are finitely many primes means that there exists a natural number n
such that there are n primes, and the statement that there are n primes means
that there is a list of distinct numbers p₁, p₂, ..., pₙ such that every number
in the list is prime, and there are no primes that are not in the list. Thus, the
second sentence of the proof applies existential instantiation to introduce
the numbers n and p₁, p₂, ..., pₙ into the proof. At this point in the proof
we have the following situation:

Givens                                          Goal
p₁, p₂, ..., pₙ are all prime                    Contradiction
¬∃q(q is prime ∧ q ∉ {p₁, p₂, ..., pₙ})

The second given could be reexpressed as a positive statement, but since
we are doing a proof by contradiction, another reasonable approach would
be to try to reach a contradiction by proving that ∃q(q is prime ∧ q ∉ {p₁,
p₂, ..., pₙ}). This is the strategy used in the proof. Thus, the goal for the
rest of the proof is to show that there is a prime number not in the list p₁, p₂,
..., pₙ, that is, an “unlisted prime.”


Because our goal is now an existential statement, it is not surprising that
the next step in the proof is to introduce the new number m, without any
explanation of how m was chosen. What is surprising is that m may or may
not be the unlisted prime we are looking for. The problem is that m might
not be prime. All we can be sure of is that m is either prime or a product of
primes. Because this statement is a disjunction, it suggests proof by cases,
and this is the method used in the rest of the proof. Although the cases are
not explicitly labeled as cases in the proof, it is important to realize that the
rest of the proof has the form of a proof by cases. In case 1 we assume that
m is prime, and in case 2 we assume that it is a product of primes. In both
cases we are able to produce an unlisted prime as required to complete the
proof.
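Both cases really do occur when the construction is applied to a finite list of primes, as a small experiment shows. The sketch below is an illustration only: no program can be handed “a list of all prime numbers,” so it applies the construction m = p₁p₂ ··· pₙ + 1 to the first few primes instead. The helper names prime_factors and euclid_number are hypothetical, and the factoring method is plain trial division.

```python
def prime_factors(m):
    # Factor m > 1 by trial division, returning prime factors in increasing order.
    factors = []
    d = 2
    while d * d <= m:
        while m % d == 0:
            factors.append(d)
            m //= d
        d += 1
    if m > 1:
        factors.append(m)
    return factors

def euclid_number(primes):
    # The number m from the proof: the product of the listed primes, plus 1.
    m = 1
    for p in primes:
        m *= p
    return m + 1
```

With the list 2, 3, 5 we get m = 31, which is itself prime (case 1). With the list 2, 3, 5, 7, 11, 13 we get m = 30031 = 59 · 509, which is not prime, but both of its prime factors are missing from the list (case 2).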


Our next proof uses factorial notation. Recall that for any positive integer
n, n factorial is the number n! = 1 · 2 · 3 ··· n.


Theorem 3.7.3. For every positive integer n, there is a sequence of n 
consecutive positive integers containing no primes. 


Proof. Suppose n is a positive integer. Let x = (n + 1)! + 2. We will show
that none of the numbers x, x + 1, x + 2, ..., x + (n - 1) is prime. Since this
is a sequence of n consecutive positive integers, this will prove the theorem.

To see that x is not prime, note that

x = 1 · 2 · 3 · 4 ··· (n + 1) + 2
  = 2 · (1 · 3 · 4 ··· (n + 1) + 1).

Thus, x can be written as a product of two smaller positive integers, so x is
not prime.

Similarly, we have

x + 1 = 1 · 2 · 3 · 4 ··· (n + 1) + 3
      = 3 · (1 · 2 · 4 ··· (n + 1) + 1),

so x + 1 is also not prime. In general, consider any number x + i, where 0 ≤ i
≤ n - 1. Then we have

x + i = 1 · 2 · 3 · 4 ··· (n + 1) + (i + 2)
      = (i + 2) · (1 · 2 · 3 ··· (i + 1) · (i + 3) ··· (n + 1) + 1),

so x + i is not prime.
□


Commentary. A sequence of n consecutive positive integers is a sequence
of the form x, x + 1, x + 2, ..., x + (n - 1), where x is a positive integer.
Thus, the logical form of the statement to be proven is
∀n > 0 ∃x > 0 ∀i(0 ≤ i ≤ n - 1 → x + i is not prime), where all variables
range over the integers. The overall plan of the proof is exactly what one
would expect for a proof of a statement of this form: we let n > 0 be
arbitrary, specify a value for x, let i be arbitrary, and then assume that
0 ≤ i ≤ n - 1 and prove that x + i is not prime. As in the proof of Theorem
3.7.1, to prove that x + i is not prime we show how to write it as a product
of two smaller positive integers.

Before the demonstration that x + i is not prime, where i is an arbitrary
integer between 0 and n - 1, the proof includes verifications that x and x + 1
are not prime. These are completely unnecessary and are only included to
make the proof easier to read.
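The factorization in the proof is explicit enough to compute. In the following sketch, the function name composite_run is an illustrative choice; it pairs each number x + i in the sequence with the divisor i + 2 that the proof identifies.

```python
from math import factorial

def composite_run(n):
    # Return pairs (x + i, i + 2) for i = 0, ..., n - 1, where x = (n + 1)! + 2.
    # The proof shows that i + 2 divides x + i and is strictly between 1 and x + i.
    x = factorial(n + 1) + 2
    return [(x + i, i + 2) for i in range(n)]
```

For n = 3 this produces the numbers 26, 27, 28 with divisors 2, 3, 4. The sequence found this way is usually far from the first run of n consecutive nonprimes; the proof only needs some such run to exist.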


Example 3.7.4. Prove that there is a unique real number m with the
following two properties:

1. For every real number x, x² + 2x + 3 ≥ m.

2. If y is any real number with the property that for every real number x,
x² + 2x + 3 ≥ y, then m ≥ y.

Scratch work


It will be convenient to have a name for property 1. We will say that m is a
lower bound for the expression x² + 2x + 3 if property 1 holds; that is, if for
every real number x, x² + 2x + 3 ≥ m. Property 2 then says that if y is any
lower bound for x² + 2x + 3, then m ≥ y. In other words, no lower bound can
be larger than m, so m is the greatest lower bound. (We will have more to
say about lower bounds and greatest lower bounds in Section 4.4 of Chapter
4.)

We will have to prove both existence and uniqueness of the number m.
For the existence half of the proof, the hardest part is coming up with the
right value for m. We can get a hint at how to choose m by completing the
square:

x² + 2x + 3 = x² + 2x + 1 + 2 = (x + 1)² + 2.

Since (x + 1)² cannot be negative, for every real number x we will have x² +
2x + 3 = (x + 1)² + 2 ≥ 2, so m = 2 will work in property 1. In other words,
2 is a lower bound for x² + 2x + 3. Of course, any smaller number would
also be a lower bound, but property 2 requires that m must be the greatest
lower bound, so m can’t be smaller than 2. Perhaps m = 2 is the right
choice. Let’s see if we can prove property 2 with this choice of m.

To prove that property 2 holds with m = 2, we must prove
∀y[∀x(x² + 2x + 3 ≥ y) → 2 ≥ y]. The obvious way to proceed is to let y be
arbitrary, assume ∀x(x² + 2x + 3 ≥ y), and then prove 2 ≥ y, which gives us
the following situation:

Givens                        Goal
∀x(x² + 2x + 3 ≥ y)           2 ≥ y

The natural way to use our given is to plug something in for x. Looking at
the goal, we see that if only there were a value of x for which x² + 2x + 3 =
2, then plugging in this value of x in the given would lead directly to the
goal. Solving the equation x² + 2x + 3 = 2, we find that setting x = -1 will
complete the proof.

We still have to prove uniqueness of m. For this we should assume that
m₁ and m₂ are two numbers that have properties 1 and 2, and then prove m₁
= m₂. This gives us the following givens and goal:

Givens                                   Goal
∀x(x² + 2x + 3 ≥ m₁)                     m₁ = m₂
∀x(x² + 2x + 3 ≥ m₂)
∀y[∀x(x² + 2x + 3 ≥ y) → m₁ ≥ y]
∀y[∀x(x² + 2x + 3 ≥ y) → m₂ ≥ y]

We should probably apply universal instantiation to one or more of the
givens, but which ones, and what values should we plug in? The key
observation is that the first two givens suggest that it would be useful to
plug in m₁ or m₂ for y in the third and fourth givens. In fact we will set y =
m₂ in the third given and y = m₁ in the fourth. (You might want to compare
this to the strategy we used for the uniqueness proof in Example 3.6.2.)
This gives us m₁ ≥ m₂ and m₂ ≥ m₁, and the goal m₁ = m₂ follows.


Solution 


Theorem. There is a unique real number m with the following two
properties:
1. For every real number x, x² + 2x + 3 ≥ m.

2. If y is any real number with the property that for every real
number x, x² + 2x + 3 ≥ y, then m ≥ y.


Proof. Existence: Let m = 2. To prove property 1, let x be an arbitrary real
number. Then

x² + 2x + 3 = (x + 1)² + 2 ≥ 2 = m,

as required. This shows that 2 is a lower bound for x² + 2x + 3.
For property 2, let y be an arbitrary number with the property that for
every x, x² + 2x + 3 ≥ y. In particular, setting x = -1 we find that

y ≤ (-1)² + 2(-1) + 3 = 2 = m.

Since y was arbitrary, this proves property 2.

Uniqueness: Suppose m₁ and m₂ both have properties 1 and 2. In other
words, m₁ and m₂ are both lower bounds for x² + 2x + 3, and also if y is any
lower bound, then m₁ ≥ y and m₂ ≥ y. Applying this last fact to both y = m₁
and y = m₂ we get m₁ ≥ m₂ and m₂ ≥ m₁, so m₁ = m₂.
□
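A numerical experiment can make the two properties of m = 2 more concrete, although sampling finitely many values of x proves nothing by itself. In the sketch below, the names q, is_lower_bound_on, and SAMPLES are illustrative, and exact rational arithmetic is used so that no rounding error interferes.

```python
from fractions import Fraction

def q(x):
    # The expression x^2 + 2x + 3 from Example 3.7.4.
    return x * x + 2 * x + 3

def is_lower_bound_on(samples, m):
    # True if q(x) >= m for every sampled x. This tests only the sample points.
    return all(q(x) >= m for x in samples)

# Sample points from -10 to 10 in steps of 1/4; the list includes x = -1.
SAMPLES = [Fraction(k, 4) for k in range(-40, 41)]
```

On these samples 2 passes the lower bound test, while any candidate larger than 2, such as 201/100, fails it because q(-1) = 2. This mirrors the proof: property 1 comes from completing the square, and property 2 comes from plugging in x = -1.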


For readers who are familiar with the definition of limits from calculus, 
we give one more example, showing how proofs involving limits can be 
worked out using the techniques in this chapter. Readers who are not 
familiar with this definition should skip this example. 


Example 3.7.5. Show that

lim_{x→3} (2x² - 5x - 3)/(x - 3) = 7.


Scratch work


According to the definition of limits, our goal means that for every positive
number ε there is a positive number δ such that if x is any number such that 0
< |x - 3| < δ, then |(2x² - 5x - 3)/(x - 3) - 7| < ε. Translating this into logical
symbols, we have

∀ε > 0 ∃δ > 0 ∀x(0 < |x - 3| < δ → |(2x² - 5x - 3)/(x - 3) - 7| < ε).


We therefore start by letting ε be an arbitrary positive number and then try to
find a positive number δ for which we can prove

∀x(0 < |x - 3| < δ → |(2x² - 5x - 3)/(x - 3) - 7| < ε).

The scratch work involved in finding δ will not appear in the proof, of
course. In the final proof we’ll just write “Let δ = (some positive number)”
and then proceed to prove

∀x(0 < |x - 3| < δ → |(2x² - 5x - 3)/(x - 3) - 7| < ε).

Before working out the value of δ, let’s figure out what the rest of the proof
will look like. Based on the form of the goal at this point, we should
proceed by letting x be arbitrary, assuming 0 < |x - 3| < δ, and then proving
|(2x² - 5x - 3)/(x - 3) - 7| < ε. Thus, the entire proof will have the
following form:

Let ε be an arbitrary positive number.
    Let δ = (some positive number).
        Let x be arbitrary.
            Suppose 0 < |x - 3| < δ.
                [Proof of |(2x² - 5x - 3)/(x - 3) - 7| < ε goes here.]
            Therefore 0 < |x - 3| < δ → |(2x² - 5x - 3)/(x - 3) - 7| < ε.
        Since x was arbitrary, we can conclude that
        ∀x(0 < |x - 3| < δ → |(2x² - 5x - 3)/(x - 3) - 7| < ε).
    Therefore ∃δ > 0 ∀x(0 < |x - 3| < δ → |(2x² - 5x - 3)/(x - 3) - 7| < ε).
Since ε was arbitrary, it follows that
∀ε > 0 ∃δ > 0 ∀x(0 < |x - 3| < δ → |(2x² - 5x - 3)/(x - 3) - 7| < ε).

Two steps remain to be worked out. We must decide what value to assign
to δ, and we must fill in the proof of |(2x² - 5x - 3)/(x - 3) - 7| < ε. We’ll
work on the second of these steps first, and in the course of working out this
step it will become clear what value we should use for δ. The givens and
goal for this second step are as follows:

Givens                              Goal
ε > 0                               |(2x² - 5x - 3)/(x - 3) - 7| < ε
δ = (some positive number)
0 < |x - 3| < δ

First of all, note that we have 0 < |x - 3| as a given, so x ≠ 3 and therefore
the fraction (2x² - 5x - 3)/(x - 3) is defined. Factoring the numerator, we
find that

|(2x² - 5x - 3)/(x - 3) - 7| = |(2x + 1)(x - 3)/(x - 3) - 7|
                             = |2x + 1 - 7| = |2x - 6| = 2|x - 3|.

Now we also have as a given that |x - 3| < δ, so 2|x - 3| < 2δ. Combining
this with the previous equation, we get |(2x² - 5x - 3)/(x - 3) - 7| < 2δ, and
our goal is |(2x² - 5x - 3)/(x - 3) - 7| < ε. Thus, if we chose δ so that 2δ =
ε, we’d be done. In other words, we should let δ = ε/2. Note that since ε >
0, this is a positive number, as required.


Solution 


Theorem. lim_{x→3} (2x² - 5x - 3)/(x - 3) = 7.

Proof. Suppose ε > 0. Let δ = ε/2, which is also clearly positive. Let x be an
arbitrary real number, and suppose that 0 < |x - 3| < δ. Then

|(2x² - 5x - 3)/(x - 3) - 7| = |(2x + 1)(x - 3)/(x - 3) - 7| = |2x + 1 - 7|
                             = |2x - 6| = 2|x - 3| < 2δ = 2(ε/2) = ε.
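
The choice δ = ε/2 can be tried out on sample points. The sketch below is a spot check under stated assumptions, not a substitute for the proof: it uses exact rational arithmetic, tests only finitely many values of x on each side of 3, and the names f, delta_for, and delta_works are hypothetical.

```python
from fractions import Fraction

def f(x):
    # The function whose limit at 3 is being studied; undefined at x = 3.
    return (2 * x * x - 5 * x - 3) / (x - 3)

def delta_for(eps):
    # The value of delta chosen in the proof.
    return eps / 2

def delta_works(eps, steps=50):
    # Test sample points x with 0 < |x - 3| < delta on both sides of 3.
    eps = Fraction(eps)
    delta = delta_for(eps)
    for k in range(1, steps):
        offset = delta * Fraction(k, steps)  # strictly between 0 and delta
        for x in (3 - offset, 3 + offset):
            if not abs(f(x) - 7) < eps:
                return False
    return True
```

Trying a few values, such as delta_works(1) or delta_works(Fraction(1, 1000)), shows the inequality holding at every sampled point, as the calculation |f(x) - 7| = 2|x - 3| < 2δ = ε predicts.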


Exercises 


PD*1. Suppose ℱ is a family of sets. Prove that there is a unique set A
that has the following two properties:
(a) ℱ ⊆ 𝒫(A).
(b) ∀B(ℱ ⊆ 𝒫(B) → A ⊆ B).
(Hint: First try an example. Let ℱ = {{1, 2, 3}, {2, 3, 4}, {3, 4,
5}}. Can you find the set A that has properties (a) and (b)?)


2. Prove that there is a unique positive real number m that has the
following two properties:
(a) For every positive real number x, x/(x + 1) < m.
(b) If y is any positive real number with the property that for every
positive real number x, x/(x + 1) < y, then m ≤ y.


PD 3. Suppose A and B are sets. What can you prove about 𝒫(A \ B) \
(𝒫(A) \ 𝒫(B))? (No, it’s not equal to ∅. Try some examples and
see what you get.)


PD 4. Suppose that A, B, and C are sets. Prove that the following
statements are equivalent:
(a) (A △ C) ∩ (B △ C) = ∅.
(b) A ∩ B ⊆ C ⊆ A ∪ B. (Note: This is a shorthand way of saying
that A ∩ B ⊆ C and C ⊆ A ∪ B.)
(c) A △ C ⊆ A △ B.


5. Suppose {A_i | i ∈ I} is a family of sets. Prove that if 𝒫(∪_{i∈I} A_i)
⊆ ∪_{i∈I} 𝒫(A_i), then there is some i ∈ I such that ∀j ∈ I(A_j ⊆ A_i).


6. Suppose ℱ is a nonempty family of sets. Let I = ∪ℱ and J = ∩ℱ.
Suppose also that J ≠ ∅, and notice that it follows that for every
X ∈ ℱ, X ≠ ∅, and also that I ≠ ∅. Finally, suppose that {A_i | i ∈
I} is an indexed family of sets.
(a) Prove that ∪_{i∈I} A_i = ∪_{X∈ℱ}(∪_{i∈X} A_i).
(b) Prove that ∩_{i∈I} A_i = ∩_{X∈ℱ}(∩_{i∈X} A_i).
(c) Prove that ∪_{i∈J} A_i ⊆ ∩_{X∈ℱ}(∪_{i∈X} A_i). Is it always the case that
∪_{i∈J} A_i = ∩_{X∈ℱ}(∪_{i∈X} A_i)? Give either a proof or a
counterexample to justify your answer.
(d) Discover and prove a theorem relating ∩_{i∈J} A_i and
∪_{X∈ℱ}(∩_{i∈X} A_i).

7. Prove that lim_{x→2} (3x² - 12)/(x - 2) = 12.


8. Prove that if lim_{x→c} f(x) = L and L > 0, then there is some
number δ > 0 such that for all x, if 0 < |x - c| < δ then f(x) > 0.


9. Prove that if lim_{x→c} f(x) = L then lim_{x→c} 7f(x) = 7L.
*10. Consider the following putative theorem.


Theorem? There are irrational numbers a and b such that a^b is
rational.

Is the following proof correct? If so, what proof strategies does it
use? If not, can it be fixed? Is the theorem correct? (Note: The proof
uses the fact that √2 is irrational, which we’ll prove in Chapter 6;
see Theorem 6.4.5.)


Proof. Either √2^√2 is rational or it’s irrational.
Case 1. √2^√2 is rational. Let a = b = √2. Then a and b are irrational,
and a^b = √2^√2, which we are assuming in this case is rational.

Case 2. √2^√2 is irrational. Let a = √2^√2 and b = √2. Then a is
irrational by assumption, and we know that b is also irrational. Also,

a^b = (√2^√2)^√2 = √2^(√2 · √2) = (√2)² = 2,

which is rational.


4 


Relations 


4.1 Ordered Pairs and Cartesian Products 


In Chapter 1 we discussed truth sets for statements containing a single free 
variable. In this chapter we extend this idea to include statements with more 
than one free variable. 

For example, suppose P(x, y) is a statement with two free variables x and 
y. We can’t speak of this statement as being true or false until we have 
specified two values — one for x and one for y. Thus, if we want the truth set 
to identify which assignments of values to free variables make the statement 
come out true, then the truth set will have to contain not individual values, 
but pairs of values. We will specify a pair of values by writing the two 
values in parentheses separated by a comma. For example, let D(x, y) mean 
“x divides y.” Then D(6, 18) is true, since 6 | 18, so the pair of values (6, 
18) is an assignment of values to the variables x and y that makes the 
statement D(x, y) come out true. Note that 18 does not divide 6, so the pair 
of values (18, 6) makes the statement D(x, y) false. We must therefore 
distinguish between the pairs (18, 6) and (6, 18). Because the order of the 
values in the pair makes a difference, we will refer to a pair (a, b) as an 
ordered pair, with first coordinate a and second coordinate b. 

You have probably seen ordered pairs before when studying points in the 
xy-plane. The use of x and y coordinates to identify points in the plane 
works by assigning to each point in the plane an ordered pair, whose 
coordinates are the x and y coordinates of the point. The pairs must be 
ordered because, for example, the points (2, 5) and (5, 2) are different 
points in the plane. In this case the coordinates of the ordered pairs are real 
numbers, but ordered pairs can have anything at all as their coordinates. For 
example, suppose we let C(x, y) stand for the statement “x has y children.” 
In this statement the variable x ranges over the set of all people, and y
ranges over the set of all natural numbers. Thus, the only ordered pairs it
makes sense to consider when discussing assignments of values to the 
variables x and y in this statement are pairs in which the first coordinate is a 
person and the second is a natural number. For example, the assignment 
(Prince Charles, 2) makes the statement C(x, y) come out true, because 
Prince Charles does have two children, whereas the assignment (Angelina 
Jolie, 37) makes the statement false. Note that the assignment (2, Prince 
Charles) makes no sense, because it would lead to the nonsensical statement 
“2 has Prince Charles children.” 

In general, if P(x, y) is a statement in which x ranges over some set A and 
y ranges over a set B, then the only assignments of values to x and y that 
will make sense in P(x, y) will be ordered pairs in which the first coordinate 
is an element of A and the second comes from B. We therefore make the 
following definition: 


Definition 4.1.1. Suppose A and B are sets. Then the Cartesian product of
A and B, denoted A × B, is the set of all ordered pairs in which the first
coordinate is an element of A and the second is an element of B. In other
words,


A × B = {(a, b) | a ∈ A and b ∈ B}.


Example 4.1.2.
1. If A = {red, green} and B = {2, 3, 5} then
A × B = {(red, 2), (red, 3), (red, 5), (green, 2), (green, 3), (green, 5)}.

2. If P = the set of all people then

P × ℕ = {(p, n) | p is a person and n is a natural number}
= {(Prince Charles, 0), (Prince Charles, 1), (Prince Charles, 2), ...,
(Angelina Jolie, 0), (Angelina Jolie, 1), ...}.

These are the ordered pairs that make sense as assignments of values
to the free variables x and y in the statement C(x, y).

3. ℝ × ℝ = {(x, y) | x and y are real numbers}. These are the coordinates
of all the points in the plane. For obvious reasons, this set is
sometimes written ℝ².
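
For finite sets, the definition can be tried out directly on a computer. The following is only a sketch: it represents ordered pairs as Python tuples, and the helper name cartesian_product is our own choice, not standard notation. It rebuilds the set from part 1 of the example and confirms that reversing the order of the factors gives a different set.

```python
from itertools import product


def cartesian_product(a, b):
    """Return A x B as a set of ordered pairs, each represented by a tuple."""
    return set(product(a, b))


A = {"red", "green"}
B = {2, 3, 5}

AxB = cartesian_product(A, B)
BxA = cartesian_product(B, A)

print(sorted(AxB))   # the six pairs listed in part 1 of Example 4.1.2
print(AxB == BxA)    # the pairs (red, 2) and (2, red) are different
```

Infinite products such as P × ℕ or ℝ × ℝ cannot be listed this way, of course; the sketch applies only to finite sets.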


The introduction of a new mathematical concept gives us an opportunity 
to practice our proof-writing techniques by proving some basic properties 
of the new concept. Here’s a theorem giving some basic properties of 
Cartesian products. 


Theorem 4.1.3. Suppose A, B, C, and D are sets.

1. A × (B ∩ C) = (A × B) ∩ (A × C).
2. A × (B ∪ C) = (A × B) ∪ (A × C).
3. (A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D).
4. (A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).
5. A × ∅ = ∅ × A = ∅.


Proof of 1. Let p be an arbitrary element of A × (B ∩ C). Then by the
definition of Cartesian product, p must be an ordered pair whose first
coordinate is an element of A and second coordinate is an element of B ∩ C.
In other words, p = (x, y) for some x ∈ A and y ∈ B ∩ C. Since y ∈ B ∩ C, y
∈ B and y ∈ C. Since x ∈ A and y ∈ B, p = (x, y) ∈ A × B, and similarly p
∈ A × C. Thus, p ∈ (A × B) ∩ (A × C). Since p was an arbitrary element of
A × (B ∩ C), it follows that A × (B ∩ C) ⊆ (A × B) ∩ (A × C).

Now let p be an arbitrary element of (A × B) ∩ (A × C). Then p ∈ A × B,
so p = (x, y) for some x ∈ A and y ∈ B. Also, (x, y) = p ∈ A × C, so y ∈ C.
Since y ∈ B and y ∈ C, y ∈ B ∩ C. Thus, p = (x, y) ∈ A × (B ∩ C). Since p
was an arbitrary element of (A × B) ∩ (A × C) we can conclude that (A × B)
∩ (A × C) ⊆ A × (B ∩ C), so A × (B ∩ C) = (A × B) ∩ (A × C).


Commentary. Before continuing with the proofs of the other parts, we give 
a brief commentary on the proof just given. Statement 1 is an equation 
between two sets, so as we saw in Example 3.4.5, there are two natural 
approaches we could take to prove it. We could prove ∀p[p ∈ A × (B ∩ C)
↔ p ∈ (A × B) ∩ (A × C)] or we could prove both A × (B ∩ C) ⊆ (A × B) ∩
(A × C) and (A × B) ∩ (A × C) ⊆ A × (B ∩ C). In this proof, we have taken
the second approach. The first paragraph gives the proof that A × (B ∩ C) ⊆
(A × B) ∩ (A × C), and the second gives the proof that (A × B) ∩ (A × C) ⊆
A × (B ∩ C).

In the first of these proofs we take the usual approach of letting p be an
arbitrary element of A × (B ∩ C) and then proving p ∈ (A × B) ∩ (A × C).
Because p ∈ A × (B ∩ C) means ∃x∃y(x ∈ A ∧ y ∈ B ∩ C ∧ p = (x, y)),
we immediately introduce the variables x and y by existential instantiation. 
The rest of the proof involves simply working out the definitions of the set 
theory operations involved. The proof of the opposite inclusion in the 
second paragraph is similar. 

Note that in both parts of this proof we introduced an arbitrary object p 
that turned out to be an ordered pair, and we were therefore able to say that 
p = (x, y) for some objects x and y. In most proofs involving Cartesian 
products mathematicians suppress this step. If it is clear from the beginning 
that an object will turn out to be an ordered pair, it is usually just called (x, 
y) from the outset. We will follow this practice in our proofs. 

We leave the proofs of statements 2 and 3 as exercises (see exercise 5). 


Proof of 4. Let (x, y) be an arbitrary element of (A × B) ∪ (C × D). Then
either (x, y) ∈ A × B or (x, y) ∈ C × D.

Case 1. (x, y) ∈ A × B. Then x ∈ A and y ∈ B, so clearly x ∈ A ∪ C and
y ∈ B ∪ D. Therefore (x, y) ∈ (A ∪ C) × (B ∪ D).

Case 2. (x, y) ∈ C × D. A similar argument shows that (x, y) ∈ (A ∪ C) ×
(B ∪ D).

Since (x, y) was an arbitrary element of (A × B) ∪ (C × D), it follows that
(A × B) ∪ (C × D) ⊆ (A ∪ C) × (B ∪ D).


Proof of 5. Suppose A × ∅ ≠ ∅. Then A × ∅ has at least one element, and
by the definition of Cartesian product this element must be an ordered pair
(x, y) for some x ∈ A and y ∈ ∅. But this is impossible, because ∅ has no
elements. Thus, A × ∅ = ∅. The proof that ∅ × A = ∅ is similar.


Commentary. Statement 4 says that one set is a subset of another, and the 
proof follows the usual pattern for statements of this form: we start with an 
arbitrary element of the first set and then prove that it’s an element of the 
second. It is clear that the arbitrary element of the first set must be an 
ordered pair, so we have written it as an ordered pair from the beginning. 


Thus, for the rest of the proof we have (x, y) ∈ (A × B) ∪ (C × D) as a
given, and the goal is to prove that (x, y) ∈ (A ∪ C) × (B ∪ D). The given
means (x, y) ∈ A × B ∨ (x, y) ∈ C × D, so proof by cases is an appropriate
strategy. In each case it is easy to prove the goal.

Statement 5 means A × ∅ = ∅ ∧ ∅ × A = ∅, so we treat this as two
goals and prove A × ∅ = ∅ and ∅ × A = ∅ separately. To say that a set
equals the empty set is actually a negative statement, although it may not
look like it on the surface, because it means that the set does not have any
elements. Thus, it is not surprising that the proof that A × ∅ = ∅ proceeds
by contradiction. The assumption that A × ∅ ≠ ∅ means ∃p(p ∈ A × ∅), so
our next step is to introduce a name for an element of A × ∅. Once again, it
is clear that the new object being introduced in the proof is an ordered pair,
so we have written it as an ordered pair (x, y) from the beginning. Writing
out the meaning of (x, y) ∈ A × ∅ leads immediately to a contradiction.

The proof that ∅ × A = ∅ is similar, but simply saying this doesn’t prove
it. Thus, the claim in the proof that this part of the proof is similar is really 
an indication that the second half of the proof is being left as an exercise. 
You should work through the details of this proof in your head (or if 
necessary write them out on paper) to make sure that a proof similar to the 
proof in the first half will really work. 
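
A computer check on particular finite sets is no substitute for the proofs above, but it can catch a misreading of a statement. The sketch below tests all five parts of Theorem 4.1.3 on sample sets chosen only for illustration; the helper name cp is our own. It also shows that the inclusion in part 4 can be proper.

```python
from itertools import product


def cp(x, y):
    """Cartesian product of two finite sets, as a set of tuples."""
    return set(product(x, y))


# Sample sets chosen for illustration only.
A, B, C, D = {1, 2}, {2, 3}, {2, 4}, {3, 5}
EMPTY = set()

checks = {
    1: cp(A, B & C) == (cp(A, B) & cp(A, C)),
    2: cp(A, B | C) == (cp(A, B) | cp(A, C)),
    3: (cp(A, B) & cp(C, D)) == cp(A & C, B & D),
    4: (cp(A, B) | cp(C, D)) <= cp(A | C, B | D),   # <= means "is a subset of"
    5: cp(A, EMPTY) == cp(EMPTY, A) == EMPTY,
}

# For these sets the inclusion in part 4 is proper: the pair (1, 5) is in
# (A ∪ C) × (B ∪ D) but in neither A × B nor C × D.
part4_is_proper = (cp(A, B) | cp(C, D)) < cp(A | C, B | D)

print(checks)
print(part4_is_proper)
```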

Because the order of the coordinates in an ordered pair matters, A × B
and B × A mean different things. Does it ever happen that A × B = B × A?
Well, one way this could happen is if A = B. Clearly if A = B then A × B = A
× A = B × A. Are there any other possibilities?

Here’s an incorrect proof that A × B = B × A only if A = B: The first
coordinates of the ordered pairs in A × B come from A, and the first
coordinates of the ordered pairs in B × A come from B. But if A × B = B ×
A, then the first coordinates in these two sets must be the same, so A = B.

This is a good example of why it’s important to stick to the rules of proof 
writing we’ve studied rather than allowing yourself to be convinced by any 
reasoning that looks plausible. The informal reasoning in the preceding 
paragraph is incorrect, and we can find the error by trying to reformulate 
this reasoning as a formal proof. Suppose A × B = B × A. To prove that A =
B we could let x be arbitrary and then try to prove x ∈ A → x ∈ B and x ∈
B → x ∈ A. For the first of these we assume x ∈ A and try to prove x ∈ B.
Now the incorrect proof suggests that we should try to show that x is the
first coordinate of some ordered pair in A × B and then use the fact that A ×
B = B × A. We could do this by trying to find some object y ∈ B and then
forming the ordered pair (x, y). Then we would have (x, y) ∈ A × B and A ×
B = B × A, and it would follow that (x, y) ∈ B × A and therefore x ∈ B. But
how can we find an object y ∈ B? We don’t have any given information
about B, other than the fact that A × B = B × A. In fact, B could be the empty
set! This is the flaw in the proof. If B = ∅, then it will be impossible to
choose y ∈ B, and the proof will fall apart. For similar reasons, the other
half of the proof won’t work if A = ∅.

Not only have we found the flaw in the proof, but we can now figure out 
what to do about it. We must take into account the possibility that A or B 
might be the empty set. 


Theorem 4.1.4. Suppose A and B are sets. Then A × B = B × A iff either
A = ∅, B = ∅, or A = B.


Proof. (→) Suppose A × B = B × A. If either A = ∅ or B = ∅, then there is
nothing more to prove, so suppose A ≠ ∅ and B ≠ ∅. We will show that A =
B. Let x be arbitrary, and suppose x ∈ A. Since B ≠ ∅ we can choose some
y ∈ B. Then (x, y) ∈ A × B = B × A, so x ∈ B.

Now suppose x ∈ B. Since A ≠ ∅ we can choose some z ∈ A. Therefore
(x, z) ∈ B × A = A × B, so x ∈ A. Thus A = B, as required.

(←) Suppose either A = ∅, B = ∅, or A = B.

Case 1. A = ∅. Then A × B = ∅ × B = ∅ = B × ∅ = B × A.

Case 2. B = ∅. Similar to case 1.

Case 3. A = B. Then A × B = A × A = B × A.
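
The role of the empty set in this theorem is easy to see experimentally. The following sketch, which assumes nothing beyond the statement of the theorem, compares the two sides of the "iff" for every pair of subsets of a three-element universe. The universe {0, 1, 2} and the helper names are our own choices, and an exhaustive check on one small universe illustrates the theorem without proving it.

```python
from itertools import combinations, product


def subsets(universe):
    """All subsets of a finite set, as a list of Python sets."""
    items = list(universe)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def cp(x, y):
    """Cartesian product of two finite sets, as a set of tuples."""
    return set(product(x, y))


def both_sides_agree(a, b):
    """Compare 'A × B = B × A' with 'A = ∅ or B = ∅ or A = B'."""
    left = cp(a, b) == cp(b, a)
    right = (not a) or (not b) or a == b
    return left == right


candidates = subsets({0, 1, 2})
all_agree = all(both_sides_agree(a, b) for a in candidates for b in candidates)
print(all_agree)

# The case the incorrect proof overlooked: B is empty and A is not.
print(cp({1, 2}, set()) == cp(set(), {1, 2}))
```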


Commentary. Of course, the statement to be proven is an iff statement, so
we prove both directions separately. For the → direction, our goal is A = ∅
∨ B = ∅ ∨ A = B, which could be written as (A = ∅ ∨ B = ∅) ∨ A = B, so
by one of our strategies for disjunctions from Chapter 3 we can assume ¬(A
= ∅ ∨ B = ∅) and prove A = B. Note that by one of De Morgan’s laws, ¬(A
= ∅ ∨ B = ∅) is equivalent to A ≠ ∅ ∧ B ≠ ∅, so we treat this as two
assumptions, A ≠ ∅ and B ≠ ∅. Of course we could also have proceeded
differently, for example by assuming A ≠ B and B ≠ ∅ and then proving A =
∅. But recall from the commentary on part 5 of Theorem 4.1.3 that A = ∅
and B = ∅ are actually negative statements, so because it is generally better
to work with positive than negative statements, we’re better off negating
both of them to get the assumptions A ≠ ∅ and B ≠ ∅ and then proving the
positive statement A = B. The assumptions A ≠ ∅ and B ≠ ∅ are existential
statements, so they are used in the proof to justify the introduction of y and
z. The proof that A = B proceeds in the obvious way, by introducing an
arbitrary object x and then proving x ∈ A ↔ x ∈ B.

For the ← direction of the proof, we have A = ∅ ∨ B = ∅ ∨ A = B as a
given, so it is natural to use proof by cases. In each case, the goal is easy to
prove.


This theorem is a better illustration of how mathematics is really done 
than most of the examples we’ve seen so far. Usually when you’re trying to 
find the answer to a mathematical question you won’t know in advance 
what the answer is going to be. You might be able to take a guess at the 
answer and you might have an idea for how the proof might go, but your 
guess might be wrong and your idea for the proof might be flawed. It is 
only by turning your idea into a formal proof, according to the rules in 
Chapter 3, that you can be sure your answer is right. Often in the course of 
trying to construct a formal proof you will discover a flaw in your 
reasoning, as we did earlier, and you may have to revise your ideas to 
overcome the flaw. The final theorem and proof are often the result of 
repeated mistakes and corrections. Of course, when mathematicians write 
up their theorems and proofs, they follow our rule that proofs are for 
justifying theorems, not for explaining thought processes, and so they don’t 
describe all the mistakes they made. But just because mathematicians don’t 
explain their mistakes in their proofs, you shouldn’t be fooled into thinking 
they don’t make any! 

Now that we know how to use ordered pairs and Cartesian products to 
talk about assigning values to free variables, we’re ready to define truth sets 
for statements containing two free variables. 


Definition 4.1.5. Suppose P(x, y) is a statement with two free variables in
which x ranges over a set A and y ranges over another set B. Then A × B is
the set of all assignments of values to x and y that make sense in the
statement P(x, y). The truth set of P(x, y) is the subset of A × B consisting of
those assignments that make the statement come out true. In other words,
the truth set of P(x, y) is the set {(a, b) ∈ A × B | P(a, b)}.


Example 4.1.6. What are the truth sets of the following statements?

1. “x has y children,” where x ranges over the set P of all people and y
ranges over ℕ.

2. “x is located in y,” where x ranges over the set C of all cities and y
ranges over the set N of all countries.

3. “y = 2x − 3,” where x and y range over ℝ.


Solutions

1. {(p, n) ∈ P × ℕ | the person p has n children} = {(Prince Charles, 2),
...}.

2. {(c, n) ∈ C × N | the city c is located in the country n} = {(New York,
United States), (Tokyo, Japan), (Paris, France), ...}.

3. {(x, y) ∈ ℝ × ℝ | y = 2x − 3} = {(0, −3), (1, −1), (2, 1), ...}. You are
probably already familiar with the fact that the ordered pairs in this set
are the coordinates of points in the plane that lie along a certain straight
line, called the graph of the equation y = 2x − 3. Thus, you can think of
the graph of the equation as a picture of its truth set!
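
When x and y range over finite sets, a truth set can be computed by testing every pair in A × B, exactly as Definition 4.1.5 describes. The sketch below is an illustration under that assumption: the function name truth_set is hypothetical, and the finite ranges of integers stand in for ℝ and for the positive integers only so that the sets can be listed.

```python
def truth_set(a, b, statement):
    """Return {(x, y) in A x B | statement(x, y)} for finite A and B."""
    return {(x, y) for x in a for y in b if statement(x, y)}


# "y = 2x - 3", with x and y restricted to a few integers instead of all of R.
line = truth_set(range(0, 4), range(-3, 4), lambda x, y: y == 2 * x - 3)
print(sorted(line))

# D(x, y): "x divides y", with both variables restricted to 1, ..., 18.
divides = truth_set(range(1, 19), range(1, 19), lambda x, y: y % x == 0)
print((6, 18) in divides, (18, 6) in divides)
```

The second truth set contains (6, 18) but not (18, 6), which is the reason given in the text for treating the pairs as ordered.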


Many of the facts about truth sets for statements with one free variable 
that we discussed in Chapter 1 carry over to truth sets for statements with 
two free variables. For example, suppose T is the truth set of a statement 
P(x, y), where x ranges over some set A and y ranges over B. Then for any a
∈ A and b ∈ B the statement (a, b) ∈ T means the same thing as P(a, b).
Also, if P(x, y) is true for every x ∈ A and y ∈ B, then T = A × B, and if
P(x, y) is false for every x ∈ A and y ∈ B, then T = ∅. If S is the truth set of
another statement Q(x, y), then the truth set of the statement P(x, y) ∧ Q(x,
y) is T ∩ S, and the truth set of P(x, y) ∨ Q(x, y) is T ∪ S.

Although we’ll be concentrating on ordered pairs for the rest of this 
chapter, it is possible to work with ordered triples, ordered quadruples, and 
so on. These might be used to talk about truth sets for statements containing 
three or more free variables. For example, let L(x, y, z) be the statement “x
has lived in y for z years,” where x ranges over the set P of all people, y
ranges over the set C of all cities, and z ranges over ℕ. Then the
assignments of values to the free variables that make sense in this statement
would be ordered triples (p, c, n), where p is a person, c is a city, and n is a
natural number. The set of all such ordered triples would be written P × C ×
ℕ, and the truth set of the statement L(x, y, z) would be the set {(p, c, n) ∈
P × C × ℕ | the person p has lived in the city c for n years}.


Exercises 


*1. What are the truth sets of the following statements? List a few
elements of each truth set.

(a) “x is a parent of y,” where x and y both range over the set P of all
people.

(b) “There is someone who lives in x and attends y,” where x ranges
over the set C of all cities and y ranges over the set U of all
universities.


2. What are the truth sets of the following statements? List a few
elements of each truth set.

(a) “x lives in y,” where x ranges over the set P of all people and y
ranges over the set C of all cities.

(b) “The population of x is y,” where x ranges over the set C of all
cities and y ranges over ℕ.


3. The truth sets of the following statements are subsets of ℝ². List a
few elements of each truth set. Draw a picture showing all the
points in the plane whose coordinates are in the truth set.

(a) y = x² − x − 2.
(b) y < x.
(c) Either y = x² − x − 2 or y = 3x − 2.
(d) y < x, and either y = x² − x − 2 or y = 3x − 2.


*4. Let A = {1, 2, 3}, B = {1, 4}, C = {3, 4}, and D = {5}. Compute
all the sets mentioned in Theorem 4.1.3 and verify that all parts
of the theorem are true.


5. Prove parts 2 and 3 of Theorem 4.1.3.


*6. What’s wrong with the following proof that for any sets A, B, C,
and D, (A ∪ C) × (B ∪ D) ⊆ (A × B) ∪ (C × D)? (Note that this is
the reverse of the inclusion in part 4 of Theorem 4.1.3.)


Proof. Suppose (x, y) ∈ (A ∪ C) × (B ∪ D). Then x ∈ A ∪ C and
y ∈ B ∪ D, so either x ∈ A or x ∈ C, and either y ∈ B or y ∈ D.
We consider these cases separately.

Case 1. x ∈ A and y ∈ B. Then (x, y) ∈ A × B.

Case 2. x ∈ C and y ∈ D. Then (x, y) ∈ C × D.

Thus, either (x, y) ∈ A × B or (x, y) ∈ C × D, so (x, y) ∈ (A ×
B) ∪ (C × D).

□


7. If A has m elements and B has n elements, how many elements
does A × B have?

PD*8. Is it true that for any sets A, B, and C, A × (B \ C) = (A × B) \ (A × C)?
Give either a proof or a counterexample to justify your answer.

PD 9. Prove that for any sets A, B, and C, A × (B △ C) = (A × B) △ (A × C).


PD*10. Prove that for any sets A, B, C, and D, (A \ C) × (B \ D) ⊆ (A ×
B) \ (C × D).

PD 11. Prove that for any sets A, B, C, and D, (A × B) \ (C × D) = [A ×
(B \ D)] ∪ [(A \ C) × B].

PD 12. Prove that for any sets A, B, C, and D, if A × B and C × D are
disjoint, then either A and C are disjoint or B and D are disjoint.

13. Suppose I ≠ ∅. Prove that for any indexed family of sets {A_i | i
∈ I} and any set B, (⋂_{i∈I} A_i) × B = ⋂_{i∈I}(A_i × B). Where in the
proof does the assumption that I ≠ ∅ get used?

14. Suppose {A_i | i ∈ I} and {B_i | i ∈ I} are indexed families of sets.

(a) Prove that ⋃_{i∈I}(A_i × B_i) ⊆ (⋃_{i∈I} A_i) × (⋃_{i∈I} B_i).
(b) For each (i, j) ∈ I × I let C_(i,j) = A_i × B_j, and let P = I × I. Prove
that ⋃_{p∈P} C_p = (⋃_{i∈I} A_i) × (⋃_{i∈I} B_i).


*15. This problem was suggested by Professor Alan Taylor of 
Union College, NY. Consider the following putative theorem. 


Theorem? For any sets A, B, C, and D, if A × B ⊆ C × D then
A ⊆ C and B ⊆ D.

Is the following proof correct? If so, what proof strategies does it
use? If not, can it be fixed? Is the theorem correct?


Proof. Suppose A × B ⊆ C × D. Let a be an arbitrary element of
A and let b be an arbitrary element of B. Then (a, b) ∈ A × B, so
since A × B ⊆ C × D, (a, b) ∈ C × D. Therefore a ∈ C and b ∈
D. Since a and b were arbitrary elements of A and B, respectively,
this shows that A ⊆ C and B ⊆ D.

□


4.2 Relations 


Suppose P(x, y) is a statement with two free variables x and y. Often such a 
statement can be thought of as expressing a relationship between x and y. 
The truth set of the statement P(x, y) is a set of ordered pairs that records 
when this relationship holds. In fact, it is often useful to think of any set of 
ordered pairs in this way, as a record of when some relationship holds. This 
is the motivation behind the following definition. 


Definition 4.2.1. Suppose A and B are sets. Then a set R ⊆ A × B is called a
relation from A to B.


If x ranges over A and y ranges over B, then clearly the truth set of any
statement P(x, y) will be a relation from A to B. However, note that
Definition 4.2.1 does not require that a set of ordered pairs be defined as the
truth set of some statement for the set to be a relation. Although thinking
about truth sets was the motivation for this definition, the definition says
nothing explicitly about truth sets. According to the definition, any subset
of A × B is to be called a relation from A to B.


Example 4.2.2. Here are some examples of relations from one set to
another.


1. Let A = {1, 2, 3}, B = {3, 4, 5}, and R = {(1, 3), (1, 5), (3, 3)}. Then R
⊆ A × B, so R is a relation from A to B.

2. Let G = {(x, y) ∈ ℝ × ℝ | x > y}. Then G is a relation from ℝ to ℝ.

3. Let A = {1, 2} and B = 𝒫(A) = {∅, {1}, {2}, {1, 2}}. Let E = {(x, y)
∈ A × B | x ∈ y}. Then E is a relation from A to B. In this case, E =
{(1, {1}), (1, {1, 2}), (2, {2}), (2, {1, 2})}.

For the next three examples, let S be the set of all students at your
school, R the set of all dorm rooms, P the set of all professors, and C
the set of all courses.

4. Let L = {(s, r) ∈ S × R | the student s lives in the dorm room r}. Then
L is a relation from S to R.

5. Let E = {(s, c) ∈ S × C | the student s is enrolled in the course c}.
Then E is a relation from S to C.

6. Let T = {(c, p) ∈ C × P | the course c is taught by the professor p}.
Then T is a relation from C to P.


So far we have concentrated mostly on developing your proof-writing 
skills. Another important skill in mathematics is the ability to understand 
and apply new definitions. Here are the definitions for several new concepts 
involving relations. We’ll soon give examples illustrating these concepts, 
but first see if you can understand the concepts based on their definitions. 


Definition 4.2.3. Suppose R is a relation from A to B. Then the domain of R
is the set

Dom(R) = {a ∈ A | ∃b ∈ B((a, b) ∈ R)}.

The range of R is the set

Ran(R) = {b ∈ B | ∃a ∈ A((a, b) ∈ R)}.

The inverse of R is the relation R⁻¹ from B to A defined as follows:

R⁻¹ = {(b, a) ∈ B × A | (a, b) ∈ R}.

Finally, suppose R is a relation from A to B and S is a relation from B to C.
Then the composition of S and R is the relation S ∘ R from A to C defined as
follows:

S ∘ R = {(a, c) ∈ A × C | ∃b ∈ B((a, b) ∈ R and (b, c) ∈ S)}.


Notice that we have assumed that the second coordinates of pairs in R and
the first coordinates of pairs in S both come from the same set B, because
that is the situation in which we will most often be interested in S ∘ R.
However, this restriction is not really necessary, as we ask you to show in 
exercise 15. 
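
For finite relations, each of the four definitions translates almost word for word into a set comprehension. The following sketch assumes that a relation is stored as a Python set of tuples; the function names dom, ran, inverse, and compose are our own. The relation R is the one from part 1 of Example 4.2.2, while S is a made-up relation introduced only to have something to compose with.

```python
def dom(r):
    """Dom(R): all first coordinates of pairs in R."""
    return {a for (a, b) in r}


def ran(r):
    """Ran(R): all second coordinates of pairs in R."""
    return {b for (a, b) in r}


def inverse(r):
    """The inverse of R: every pair of R with its coordinates swapped."""
    return {(b, a) for (a, b) in r}


def compose(s, r):
    """S ∘ R: pairs (a, c) with (a, b) in R and (b, c) in S for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


R = {(1, 3), (1, 5), (3, 3)}            # part 1 of Example 4.2.2
S = {(3, "x"), (4, "y"), (5, "x")}      # hypothetical relation from B to {"x", "y"}

print(dom(R), ran(R))
print(inverse(R))
print(compose(S, R))
```

Note that the arguments of compose are written in the same order as in the notation S ∘ R, so the relation that is applied first is the second argument.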


According to Definition 4.2.3, the domain of a relation from A to B is the 
set containing all the first coordinates of ordered pairs in the relation. This 
will in general be a subset of A, but it need not be all of A. For example, 
consider the relation L from part 4 of Example 4.2.2, which pairs up 
students with the dorm rooms in which they live. The domain of L would 
contain all students who appear as the first coordinate in some ordered pair 
in L — in other words, all students who live in some dorm room — but would 
not contain, for example, students who live in apartments off campus. 
Working it out more carefully from the definition as stated, we have 


Dom(L) = {s ∈ S | ∃r ∈ R((s, r) ∈ L)}
= {s ∈ S | ∃r ∈ R(the student s lives in the dorm room r)}
= {s ∈ S | the student s lives in some dorm room}.


Similarly, the range of a relation is the set containing all the second 
coordinates of its ordered pairs. For example, the range of the relation L 
would be the set of all dorm rooms in which some student lives. Any dorm 
rooms that are unoccupied would not be in the range of L. 

The inverse of a relation contains exactly the same ordered pairs as the 
original relation, but with the order of the coordinates of each pair reversed. 
Thus, in the case of the relation L, if Joe Smith lives in room 213 Davis
Hall, then (Joe Smith, 213 Davis Hall) ∈ L and (213 Davis Hall, Joe Smith)
∈ L⁻¹. In general, for any student s and dorm room r, we would have (r, s)
∈ L⁻¹ iff (s, r) ∈ L. For another example, consider the relation G from part
2 of Example 4.2.2. It contains all ordered pairs of real numbers (x, y) in
which x is greater than y. We might call it the “greater-than” relation. Its
inverse is


G⁻¹ = {(x, y) ∈ ℝ × ℝ | (y, x) ∈ G}
= {(x, y) ∈ ℝ × ℝ | y > x}
= {(x, y) ∈ ℝ × ℝ | x < y}.


In other words, the inverse of the greater-than relation is the less-than 
relation! 
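
The same calculation can be carried out on a finite piece of G. In the sketch below the universe {0, 1, 2, 3} is an arbitrary stand-in for ℝ, chosen only so that the relations can be listed.

```python
U = range(4)  # a small finite stand-in for the real numbers

G = {(x, y) for x in U for y in U if x > y}          # "greater-than" on U
G_inverse = {(y, x) for (x, y) in G}                  # swap the coordinates
less_than = {(x, y) for x in U for y in U if x < y}   # "less-than" on U

print(G_inverse == less_than)
```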

The most difficult concept introduced in Definition 4.2.3 is the concept 
of the composition of two relations. For an example of this concept, 
consider the relations E and T from parts 5 and 6 of Example 4.2.2. Recall 
that E is a relation from the set S of all students to the set C of all courses, 
and T is a relation from C to the set P of all professors. According to 
Definition 4.2.3, the composition T ∘ E is the relation from S to P defined as
follows:


T ∘ E = {(s, p) ∈ S × P | ∃c ∈ C((s, c) ∈ E and (c, p) ∈ T)}
= {(s, p) ∈ S × P | ∃c ∈ C(the student s is enrolled in the course c
and the course c is taught by the professor p)}
= {(s, p) ∈ S × P | the student s is enrolled in some course taught
by the professor p}.


Thus, if Joe Smith is enrolled in Biology 12 and Biology 12 is taught by
Prof. Evans, then (Joe Smith, Biology 12) ∈ E and (Biology 12, Prof.
Evans) ∈ T, and therefore (Joe Smith, Prof. Evans) ∈ T ∘ E. In general, if s
is some particular student and p is a particular professor, then (s, p) ∈ T ∘ E
iff there is some course c such that (s, c) ∈ E and (c, p) ∈ T. This notation
may seem backward at first. If (s, c) ∈ E and (c, p) ∈ T, then you might be
tempted to write (s, p) ∈ E ∘ T, but according to our definition, the proper
notation is (s, p) ∈ T ∘ E. The reason we’ve chosen to write compositions of
relations in this way will become clear in Chapter 5. For the moment, you’ll
just have to be careful about this notational detail when working with
compositions of relations.


Example 4.2.4. Let S, R, C, and P be the sets of students, dorm rooms,
courses, and professors at your school, as before, and let L, E, and T be the
relations defined in parts 4-6 of Example 4.2.2. Describe the following
relations.


1. E⁻¹.
2. E ∘ L⁻¹.
3. E⁻¹ ∘ E.
4. E ∘ E⁻¹.
5. T ∘ (E ∘ L⁻¹).
6. (T ∘ E) ∘ L⁻¹.


Solutions


1. E⁻¹ = {(c, s) ∈ C × S | (s, c) ∈ E} = {(c, s) ∈ C × S | the student s is
enrolled in the course c}. For example, if Joe Smith is enrolled in
Biology 12, then (Joe Smith, Biology 12) ∈ E and (Biology 12, Joe
Smith) ∈ E⁻¹.

2. Because L⁻¹ is a relation from R to S and E is a relation from S to C,
E ∘ L⁻¹ will be the relation from R to C defined as follows.


E ∘ L⁻¹ = {(r, c) ∈ R × C | ∃s ∈ S((r, s) ∈ L⁻¹ and (s, c) ∈ E)}
= {(r, c) ∈ R × C | ∃s ∈ S((s, r) ∈ L and (s, c) ∈ E)}
= {(r, c) ∈ R × C | ∃s ∈ S(the student s lives in the dorm
room r and is enrolled in the course c)}
= {(r, c) ∈ R × C | some student who lives in the room r
is enrolled in the course c}.


Returning to our favorite student Joe Smith, who is enrolled in
Biology 12 and lives in room 213 Davis Hall, we have (213 Davis
Hall, Joe Smith) ∈ L⁻¹ and (Joe Smith, Biology 12) ∈ E. According
to the definition of composition, it follows that (213 Davis Hall,
Biology 12) ∈ E ∘ L⁻¹.


3. Because E is a relation from S to C and E⁻¹ is a relation from C to S,
E⁻¹ ∘ E is the relation from S to S defined as follows.


E⁻¹ ∘ E = {(s, t) ∈ S × S | ∃c ∈ C((s, c) ∈ E and (c, t) ∈ E⁻¹)}
= {(s, t) ∈ S × S | ∃c ∈ C(the student s is enrolled in the
course c, and so is the student t)}
= {(s, t) ∈ S × S | there is some course that the students s
and t are both enrolled in}.


(Note that an arbitrary element of S × S is written (s, t), not (s, s),
because we don’t want to assume that the two coordinates are equal.)

4. This is not the same as the last example! Because E⁻¹ is a relation
from C to S and E is a relation from S to C, E ∘ E⁻¹ is a relation from
C to C. It is defined as follows.


E ∘ E⁻¹ = {(c, d) ∈ C × C | ∃s ∈ S((c, s) ∈ E⁻¹ and (s, d) ∈ E)}
= {(c, d) ∈ C × C | ∃s ∈ S(the student s is enrolled in the
course c, and he or she is also enrolled in the course d)}
= {(c, d) ∈ C × C | there is some student who is enrolled in
both of the courses c and d}.


5. We saw in part 2 that E ∘ L⁻¹ is a relation from R to C, and T is a
relation from C to P, so T ∘ (E ∘ L⁻¹) is the relation from R to P
defined as follows.


T ∘ (E ∘ L⁻¹) = {(r, p) ∈ R × P | ∃c ∈ C((r, c) ∈ E ∘ L⁻¹ and
(c, p) ∈ T)}
= {(r, p) ∈ R × P | ∃c ∈ C(some student who lives in the
room r is enrolled in the course c, and c is taught by
the professor p)}
= {(r, p) ∈ R × P | some student who lives in the room r
is enrolled in some course taught by the professor p}.


6. (T ∘ E) ∘ L⁻¹ = {(r, p) ∈ R × P | ∃s ∈ S((r, s) ∈ L⁻¹ and
(s, p) ∈ T ∘ E)}
= {(r, p) ∈ R × P | ∃s ∈ S(the student s lives in the
room r, and is enrolled in some course taught by
the professor p)}
= {(r, p) ∈ R × P | some student who lives in the room r
is enrolled in some course taught by the professor p}.


Notice that our answers for parts 3 and 4 of Example 4.2.4 were different, 
so composition of relations is not commutative. However, our answers for 
parts 5 and 6 turned out to be the same. Is this a coincidence, or is it true in 
general that composition of relations is associative? Often, looking at 
examples of a new concept will suggest general rules that might apply to it. 
Although one counterexample is enough to show that a rule is incorrect, we 
should never accept a rule as correct without a proof. The next theorem 
summarizes some of the basic properties of the new concepts we have 
introduced. 
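
Questions like these can be explored by computer before a proof is attempted. The sketch below, which is only an experiment and not a proof, first exhibits one pair of small relations whose compositions in the two orders differ, and then searches randomly generated relations on {0, 1, 2} for a counterexample to associativity or to the rule (S ∘ R)⁻¹ = R⁻¹ ∘ S⁻¹. The helper names, the three-element universe, and the number of trials are all arbitrary choices of ours.

```python
import random


def compose(s, r):
    """S ∘ R: pairs (a, c) with (a, b) in R and (b, c) in S for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}


def inverse(r):
    """The inverse relation: every pair with its coordinates swapped."""
    return {(b, a) for (a, b) in r}


def random_relation(rng, universe):
    """Include each pair of universe × universe with probability 1/2."""
    return {(x, y) for x in universe for y in universe if rng.random() < 0.5}


# One counterexample settles commutativity.
R0, S0 = {(0, 1)}, {(1, 2)}
print(compose(S0, R0), compose(R0, S0))

# A random search for counterexamples to the other two rules.
rng = random.Random(0)
U = range(3)
assoc_counterexample = None
inverse_counterexample = None
for _ in range(200):
    R, S, T = (random_relation(rng, U) for _ in range(3))
    if compose(T, compose(S, R)) != compose(compose(T, S), R):
        assoc_counterexample = (R, S, T)
    if inverse(compose(S, R)) != compose(inverse(R), inverse(S)):
        inverse_counterexample = (R, S)

print(assoc_counterexample, inverse_counterexample)
```

A search that finds no counterexample only makes a rule more plausible. It is the proof that settles the matter.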


Theorem 4.2.5. Suppose R is a relation from A to B, S is a relation from B
to C, and T is a relation from C to D. Then:


1. (R⁻¹)⁻¹ = R.
2. Dom(R⁻¹) = Ran(R).
3. Ran(R⁻¹) = Dom(R).
4. T ∘ (S ∘ R) = (T ∘ S) ∘ R.
5. (S ∘ R)⁻¹ = R⁻¹ ∘ S⁻¹.


Proof. We will prove 1, 2, and half of 4, and leave the rest as exercises. (See 
exercise 7.) 


1. First of all, note that R⁻¹ is a relation from B to A, so (R⁻¹)⁻¹ is a
relation from A to B, just like R. To see that (R⁻¹)⁻¹ = R, let (a, b) be
an arbitrary ordered pair in A × B. Then


(a, b) ∈ (R⁻¹)⁻¹ iff (b, a) ∈ R⁻¹ iff (a, b) ∈ R.


2. First note that Dom(R⁻¹) and Ran(R) are both subsets of B. Now let
b be an arbitrary element of B. Then


b ∈ Dom(R⁻¹) iff ∃a ∈ A((b, a) ∈ R⁻¹)
             iff ∃a ∈ A((a, b) ∈ R) iff b ∈ Ran(R).


4. Clearly T ∘ (S ∘ R) and (T ∘ S) ∘ R are both relations from A to D. Let
(a, d) be an arbitrary element of A × D.
First, suppose (a, d) ∈ T ∘ (S ∘ R). By the definition of composition,
this means that we can choose some c ∈ C such that (a, c) ∈ S ∘ R
and (c, d) ∈ T. Since (a, c) ∈ S ∘ R, we can again use the definition
of composition and choose some b ∈ B such that (a, b) ∈ R and (b, c)
∈ S. Now since (b, c) ∈ S and (c, d) ∈ T, we can conclude that (b, d)
∈ T ∘ S. Similarly, since (a, b) ∈ R and (b, d) ∈ T ∘ S, it follows that
(a, d) ∈ (T ∘ S) ∘ R.
Now suppose (a, d) ∈ (T ∘ S) ∘ R. A similar argument, which is left
to the reader, shows that (a, d) ∈ T ∘ (S ∘ R). Thus, T ∘ (S ∘ R) = (T ∘
S) ∘ R.
□
Commentary. Statement 1 means ∀p(p ∈ (R⁻¹)⁻¹ ↔ p ∈ R), so the proof
should proceed by introducing an arbitrary object p and then proving p ∈
(R⁻¹)⁻¹ ↔ p ∈ R. But because R and (R⁻¹)⁻¹ are both relations from A to B,
we could think of the universe over which p ranges as being A × B, so p
must be an ordered pair. Thus, in the preceding proof we’ve written it as an
ordered pair (a, b) from the start. The proof of the biconditional statement
(a, b) ∈ (R⁻¹)⁻¹ ↔ (a, b) ∈ R uses the method, introduced in Example
3.4.5, of stringing together a sequence of equivalences.

The proofs of statements 2 and 4 are similar, except that the biconditional
proof for statement 4 cannot easily be done by stringing together
equivalences, so we prove the two directions separately. Only one direction
was proven. The key to this proof is to recognize that the given (a, d) ∈ T ∘
(S ∘ R) is an existential statement, since it means ∃c ∈ C((a, c) ∈ S ∘ R and
(c, d) ∈ T), so we should introduce a new variable c into the proof to stand
for some element of C such that (a, c) ∈ S ∘ R and (c, d) ∈ T. Similarly, (a,
c) ∈ S ∘ R is an existential statement, so it suggests introducing the variable
b. Once these new variables have been introduced, it is easy to prove the
goal (a, d) ∈ (T ∘ S) ∘ R.
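
Before attempting the remaining parts of the proof, it can be reassuring to test all five statements of Theorem 4.2.5 on a concrete case. The sketch below does this in Python; the relations R, S, and T are made-up examples, and the function names are assumptions of the sketch. A successful check on one example is evidence, not a proof.

```python
def inverse(rel):
    """Swap the coordinates of every ordered pair."""
    return {(b, a) for (a, b) in rel}

def compose(s, r):
    """S o R: pairs (a, c) linked through some b with (a, b) in R, (b, c) in S."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def domain(rel):
    return {a for (a, _) in rel}

def range_of(rel):
    return {b for (_, b) in rel}

# Hypothetical relations: R from A to B, S from B to C, T from C to D.
R = {(1, "x"), (1, "y"), (2, "y")}
S = {("x", 10), ("y", 10), ("y", 20)}
T = {(10, "p"), (20, "q")}

checks = {
    1: inverse(inverse(R)) == R,
    2: domain(inverse(R)) == range_of(R),
    3: range_of(inverse(R)) == domain(R),
    4: compose(T, compose(S, R)) == compose(compose(T, S), R),
    5: inverse(compose(S, R)) == compose(inverse(R), inverse(S)),
}
print(checks)  # every statement holds for this example
```

Note the reversed order on the right-hand side of check 5, matching statement 5 of the theorem.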


Statement 5 of Theorem 4.2.5 perhaps deserves some comment. First of 
all, notice that the right-hand side of the equation is R⁻¹ ∘ S⁻¹, not S⁻¹ ∘ R⁻¹;
the order of the relations has been reversed. You are asked to prove 
statement 5 in exercise 7, but it might be worthwhile to try an example first. 
We’ve already seen that, for the relations E and T from parts 5 and 6 of 
Example 4.2.2, 

T ∘ E = {(s, p) ∈ S × P | the student s is enrolled in some course
         taught by the professor p}.

It follows that

(T ∘ E)⁻¹ = {(p, s) ∈ P × S | the student s is enrolled in some course
             taught by the professor p}.

To compute E⁻¹ ∘ T⁻¹, first note that T⁻¹ is a relation from P to C and E⁻¹
is a relation from C to S, so E⁻¹ ∘ T⁻¹ is a relation from P to S. Now,
applying the definition of composition, we get

E⁻¹ ∘ T⁻¹ = {(p, s) ∈ P × S | ∃c ∈ C((p, c) ∈ T⁻¹ and (c, s) ∈ E⁻¹)}
          = {(p, s) ∈ P × S | ∃c ∈ C((c, p) ∈ T and (s, c) ∈ E)}
          = {(p, s) ∈ P × S | ∃c ∈ C(the course c is taught by the
             professor p and the student s is enrolled in the course c)}
          = {(p, s) ∈ P × S | the student s is enrolled in some course
             taught by the professor p}.

Thus, (T ∘ E)⁻¹ = E⁻¹ ∘ T⁻¹.


Exercises 


*1. Find the domains and ranges of the following relations. 


(a) {(p, q) ∈ P × P | the person p is a parent of the person q}, where P is
the set of all living people.
(b) {(x, y) ∈ ℝ² | y > x²}.


2. Find the domains and ranges of the following relations. 

(a) {(p, q) ∈ P × P | the person p is a brother of the person q}, where P is
the set of all living people. 

(b) {(x, y) ∈ ℝ² | y² = 1 − 2/(x² + 1)}.


3. Let L and E be the relations defined in parts 4 and 5 of Example 4.2.2. 
Describe the following relations: 

(a) L⁻¹ ∘ L.

(b) E ∘ (L⁻¹ ∘ L).


4. Let E and T be the relations defined in parts 5 and 6 of Example 4.2.2. 
Also, as in that example, let C be the set of all courses at your school, 
and let D = {Monday, Tuesday, Wednesday, Thursday, Friday}. Let M 
= {(c, d) ∈ C × D | the course c meets on the day d}. Describe the
following relations:

(a) M ∘ E.

(b) M ∘ T⁻¹.


*5. Suppose that A = {1, 2, 3}, B= {4, 5, 6}, R= {(1, 4), (1, 5), (2, 5), (3, 
6)}, and S = {(4, 5), (4, 6), (5, 4), (6, 6)}. Note that R is a relation 
from A to B and S is a relation from B to B. Find the following 
relations: 

(a) S ∘ R.

(b) S ∘ S⁻¹.


6. Suppose that A = {1, 2, 3}, B= {4, 5}, C= {6, 7, 8}, R= {(1, 7), (3, 
6), (3, 7)}, and S = {(4, 7),(4, 8),(5, 6)}. Note that R is a relation from 
A to C and S is a relation from B to C. Find the following relations: 

(a) S⁻¹ ∘ R.

(b) R⁻¹ ∘ S.

7. (a) Prove part 3 of Theorem 4.2.5 by imitating the proof of part 2 in 
the text. 


(b) Give an alternative proof of part 3 of Theorem 4.2.5 by showing that it 
follows from parts 1 and 2. 

(c) Complete the proof of part 4 of Theorem 4.2.5. 

(d) Prove part 5 of Theorem 4.2.5. 


*8. Let E = {(p, q) ∈ P × P | the person p is an enemy of the person q},
and F = {(p, q) ∈ P × P | the person p is a friend of the person q},
where P is the set of all people. What does the saying “an enemy of 
one’s enemy is one’s friend” mean about the relations E and F? 


9. Suppose R is a relation from A to B and S is a relation from B to C. 

(a) Prove that Dom(S ∘ R) ⊆ Dom(R).

(b) Prove that if Ran(R) ⊆ Dom(S) then Dom(S ∘ R) = Dom(R).

(c) Formulate and prove similar theorems about Ran (S ° R). 

10. Suppose R and S are relations from A to B. Must the following
statements be true? Justify your answers with proofs or counterexamples.

(a) R ⊆ Dom(R) × Ran(R).

(b) If R ⊆ S then R⁻¹ ⊆ S⁻¹.

(c) (R ∪ S)⁻¹ = R⁻¹ ∪ S⁻¹.

*11. Suppose R is a relation from A to B and S is a relation from B to C. 
Prove that S ∘ R = ∅ iff Ran(R) and Dom(S) are disjoint.


PD12. Suppose R is a relation from A to B and S and T are relations from 
B to C. 

(a) Prove that (S ∘ R) \ (T ∘ R) ⊆ (S \ T) ∘ R.

(b) What’s wrong with the following proof that (S \ T) ∘ R ⊆ (S ∘ R) \
(T ∘ R)?


Proof. Suppose (a, c) ∈ (S \ T) ∘ R. Then we can choose some b ∈ B
such that (a, b) ∈ R and (b, c) ∈ S \ T, so (b, c) ∈ S and (b, c) ∉ T.
Since (a, b) ∈ R and (b, c) ∈ S, (a, c) ∈ S ∘ R. Similarly, since (a, b)
∈ R and (b, c) ∉ T, (a, c) ∉ T ∘ R. Therefore (a, c) ∈ (S ∘ R) \ (T ∘
R). Since (a, c) was arbitrary, this shows that (S \ T) ∘ R ⊆ (S ∘ R) \ (T
∘ R).
□


(c) Must it be true that (S \ T) ∘ R ⊆ (S ∘ R) \ (T ∘ R)? Justify your answer
with either a proof or a counterexample.


13. Suppose R and S are relations from A to B and T is a relation from B 
to C. Must the following statements be true? Justify your answers 
with proofs or counterexamples. 


(a) If R and S are disjoint, then so are R⁻¹ and S⁻¹.
(b) If R and S are disjoint, then so are T ∘ R and T ∘ S.
(c) If T ∘ R and T ∘ S are disjoint, then so are R and S.


PD14. Suppose R is a relation from A to B, and S and T are relations from 
B to C. Must the following statements be true? Justify your answers 
with proofs or counterexamples. 

(a) If S ⊆ T then S ∘ R ⊆ T ∘ R.

(b) (S ∩ T) ∘ R ⊆ (S ∘ R) ∩ (T ∘ R).

(c) (S ∩ T) ∘ R = (S ∘ R) ∩ (T ∘ R).

(d) (S ∪ T) ∘ R = (S ∘ R) ∪ (T ∘ R).


15. Suppose R is a relation from A to B and S is a relation from C to D. 
Show that there is a set E such that R is a relation from A to E and S is 
a relation from E to D, and therefore the definition of S ° R in 
Definition 4.2.3 can be applied. Furthermore, the definition gives the 
same result no matter which such set E is used. 


4.3 More About Relations 


Although we have defined relations to be sets of ordered pairs, it is 
sometimes useful to be able to think about them in other ways. Often even a 
small change in notation can help us see things differently. One alternative 
notation that mathematicians sometimes use with relations is motivated by 
the fact that in mathematics we often express a relationship between two 
objects x and y by putting some symbol between them. For example, the 
notations x = y, x < y, x ∈ y, and x ⊆ y express four important
mathematical relationships between x and y. Imitating these notations, if R
is a relation from A to B, x ∈ A, and y ∈ B, mathematicians sometimes
write xRy to mean (x, y) ∈ R.

For example, if L is the relation defined in part 4 of Example 4.2.2, then 
for any student s and dorm room r, sLr means (s, r) ∈ L, or in other words,
the student s lives in the dorm room r. Similarly, if E and T are the relations 
defined in parts 5 and 6 of Example 4.2.2, then sEc means that the student s 
is enrolled in the course c, and cTp means that the course c is taught by the 
professor p. The definition of composition of relations could have been 
stated by saying that if R is a relation from A to B and S is a relation from B 
to C, then S ∘ R = {(a, c) ∈ A × C | ∃b ∈ B(aRb and bSc)}.


Another way to think about relations is to draw pictures of them. Figure 
4.1 shows a picture of the relation R = {(1, 3), (1, 5), (3, 3)} from part 1 of 
Example 4.2.2. Recall that this was a relation from the set A = {1, 2, 3} to 
the set B = {3, 4, 5}. In the figure, each of these sets is represented by an 
oval, with the elements of the set represented by dots inside the oval. Each 
ordered pair (a, b) ∈ R is represented by an arrow from the dot representing
a to the dot representing b. For example, there is an arrow from the dot 
inside A labeled 1 to the dot inside B labeled 5 because the ordered pair (1, 
5) is an element of R. 

In general, any relation R from a set A to a set B can be represented by 
such a picture. The dots representing the elements of A and B in such a 
picture are called vertices, and the arrows representing the ordered pairs in 
R are called edges. It doesn’t matter exactly how the vertices representing 
elements of A and B are arranged on the page; what’s important is that the 
edges correspond precisely to the ordered pairs in R. Drawing these pictures 
may help you understand the concepts discussed in the last section. For 
example, you should be able to convince yourself that you could find the 
domain of R by locating those vertices in A that have edges pointing away 
from them. Similarly, the range of R would consist of those elements of B 
whose vertices have edges pointing toward them. For the relation R shown 
in Figure 4.1, we have Dom (R) = {1, 3} and Ran (R) = {3, 5}. A picture of 
R⁻¹ would look just like a picture of R but with the directions of all the
arrows reversed. 
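
The picture-based description of domain, range, and inverse translates directly into a computation on the set of edges. The following Python sketch uses the relation R from Figure 4.1; the function names are assumptions of the sketch.

```python
# The relation R = {(1, 3), (1, 5), (3, 3)} from A = {1, 2, 3} to B = {3, 4, 5}.
R = {(1, 3), (1, 5), (3, 3)}

def domain(rel):
    """Vertices that have an edge pointing away from them."""
    return {a for (a, _) in rel}

def range_of(rel):
    """Vertices that have an edge pointing toward them."""
    return {b for (_, b) in rel}

def inverse(rel):
    """Reverse the direction of every arrow."""
    return {(b, a) for (a, b) in rel}

print(sorted(domain(R)))    # [1, 3]
print(sorted(range_of(R)))  # [3, 5]
print(sorted(inverse(R)))   # [(3, 1), (3, 3), (5, 1)]
```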


Figure 4.1. [Diagram: the sets A and B drawn as ovals, with an arrow from a
to b for each ordered pair (a, b) in R.]


Pictures illustrating the composition of two relations are a little harder to 
understand. For example, consider again the relations E and T from parts 5 
and 6 of Example 4.2.2. Figure 4.2 shows what part of both relations might 
look like. (The complete picture might be quite large if there are many 
students, courses, and professors at your school.) We can see in this picture 
that, for example, Joe Smith is taking Biology 12 and Math 21, that Biology 
12 is taught by Prof. Evans, and that Math 21 is taught by Prof. Andrews. 
Thus, applying the definition of composition, we can see that the pairs (Joe 
Smith, Prof. Evans) and (Joe Smith, Prof. Andrews) are both elements of 
the relation T° E. 


Figure 4.2. [Diagram: part of the relations E and T, with arrows from
students to courses and from courses to professors.]


To see more clearly how the composition T ∘ E is represented in this
picture, first note that for any student s, course c, and professor p, there is
an arrow from s to c iff sEc, and there is an arrow from c to p iff cTp. Thus,
according to the definition of composition,

T ∘ E = {(s, p) ∈ S × P | ∃c ∈ C(sEc and cTp)}
      = {(s, p) ∈ S × P | ∃c ∈ C(in Figure 4.2, there is an arrow
         from s to c and an arrow from c to p)}
      = {(s, p) ∈ S × P | in Figure 4.2, you can get from s to p in
         two steps by following the arrows}.


For example, starting at the vertex labeled Mary Edwards, we can get to 
Prof. Andrews in two steps (going by way of either Math 21 or Math 13), so 
we can conclude that (Mary Edwards, Prof. Andrews) ∈ T ∘ E.
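
The "two steps by following the arrows" description can be sketched in Python as follows. Only the arrows that the text explicitly mentions are included here, so this is a partial reconstruction of Figure 4.2, and the function names `successors` and `two_step` are assumptions of the sketch.

```python
from collections import defaultdict

# Arrows mentioned in the text (the full figure may contain more).
E = {("Joe Smith", "Biology 12"), ("Joe Smith", "Math 21"),
     ("Mary Edwards", "Math 21"), ("Mary Edwards", "Math 13")}
T = {("Biology 12", "Prof. Evans"), ("Math 21", "Prof. Andrews"),
     ("Math 13", "Prof. Andrews")}

def successors(rel):
    """Map each vertex to the set of vertices its arrows point to."""
    out = defaultdict(set)
    for a, b in rel:
        out[a].add(b)
    return out

def two_step(first, second):
    """Pairs (a, c) reachable by one arrow of `first`, then one of `second`."""
    nxt = successors(second)
    return {(a, c) for (a, b) in first for c in nxt.get(b, set())}

T_after_E = two_step(E, T)
print(("Mary Edwards", "Prof. Andrews") in T_after_E)  # True

# The intermediate courses that witness this pair.
via = {c for (s, c) in E
       if s == "Mary Edwards" and (c, "Prof. Andrews") in T}
print(sorted(via))  # ['Math 13', 'Math 21']
```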


In some situations we draw pictures of relations in a slightly different
way. For example, if A is a set and R ⊆ A × A, then according to Definition
4.2.1, R would be called a relation from A to A. Such a relation is also
sometimes called a relation on A (or a binary relation on A). Relations of
this type come up often in mathematics; in fact, we have already seen a few
of them. For example, we described the relation G in part 2 of Example
4.2.2 as a relation from ℝ to ℝ, but in our new terminology we could call it
a relation (or a binary relation) on ℝ. The relation E⁻¹ ∘ E from Example
4.2.4 was a relation on the set S, and E ∘ E⁻¹ was a relation on C.


Example 4.3.1. Here are some more examples of relations on sets. 


1. Let A = {1, 2} and B = 𝒫(A) = {∅, {1}, {2}, {1, 2}} as in part 3 of
Example 4.2.2. Let
S = {(x, y) ∈ B × B | x ⊆ y}
  = {(∅, ∅), (∅, {1}), (∅, {2}), (∅, {1, 2}), ({1}, {1}), ({1}, {1, 2}),
     ({2}, {2}), ({2}, {1, 2}), ({1, 2}, {1, 2})}.
Then S is a relation on B.


2. Suppose A is a set. Let i_A = {(x, y) ∈ A × A | x = y}. Then i_A is a
relation on A. (It is called the identity relation on A.) For example, if
A = {1, 2, 3}, then i_A = {(1, 1), (2, 2), (3, 3)}. Note that i_A could also be
defined by writing i_A = {(x, x) | x ∈ A}.


3. For each positive real number r, let D_r = {(x, y) ∈ ℝ × ℝ | x and y
differ by less than r, or in other words |x − y| < r}. Then D_r is a
relation on ℝ.


Suppose R is a relation on a set A. If we used the method described 
earlier to draw a picture of R, then we would have to draw two copies of the 
set A and then draw edges from one copy of A to the other to represent the 
ordered pairs in R. An easier way to draw the picture would be to draw just 
one copy of A and then connect the vertices representing the elements of A 
with edges to represent the ordered pairs in R. For example, Figure 4.3 
shows a picture of the relation S from part 1 of Example 4.3.1. Pictures like 
the one in Figure 4.3 are called directed graphs. 


Figure 4.3. [Directed graph of the relation S on the vertices ∅, {1}, {2},
and {1, 2}.]


Note that in this directed graph there is an edge from ∅ to itself, because
(∅, ∅) ∈ S. Edges such as this one that go from a vertex to itself are called
loops. In fact, in Figure 4.3 there is a loop at every vertex, because S has the
property that ∀x ∈ B((x, x) ∈ S). We describe this situation by saying that S
is reflexive.


Definition 4.3.2. Suppose R is a relation on A. 


1. R is said to be reflexive on A (or just reflexive, if A is clear from
context) if ∀x ∈ A(xRx), or in other words ∀x ∈ A((x, x) ∈ R).
2. R is symmetric if ∀x ∈ A ∀y ∈ A(xRy → yRx).
3. R is transitive if ∀x ∈ A ∀y ∈ A ∀z ∈ A((xRy ∧ yRz) → xRz).


As we saw in Example 4.3.1, if R is reflexive on A, then the directed 
graph representing R will have loops at all vertices. If R is symmetric, then 
whenever there is an edge from x to y, there will also be an edge from y to 
x. If x and y are distinct, it follows that there will be two edges connecting x 
and y, one pointing in each direction. Thus, if R is symmetric, then all edges 
except loops will come in such pairs. If R is transitive, then whenever there 
is an edge from x to y and y to z, there is also an edge from x to z. 


Example 4.3.3. Is the relation G from part 2 of Example 4.2.2 reflexive? 
Is it symmetric? Transitive? Are the relations in Example 4.3.1 reflexive, 
symmetric, or transitive? 


Solution 


Recall that the relation G from Example 4.2.2 is a relation on ℝ and that for
any real numbers x and y, xGy means x > y. Thus, to say that G is reflexive
would mean that ∀x ∈ ℝ(xGx), or in other words ∀x ∈ ℝ(x > x), and this is
clearly false. To say that G is symmetric would mean that ∀x ∈ ℝ ∀y ∈ ℝ(x
> y → y > x), and this is also clearly false. Finally, to say that G is
transitive would mean that ∀x ∈ ℝ ∀y ∈ ℝ ∀z ∈ ℝ((x > y ∧ y > z) → x >
z), and this is true. Thus, G is transitive, but not reflexive or symmetric.

The analysis of the relations in Example 4.3.1 is similar. For the relation
S in part 1 we use the fact that for any x and y in B, xSy means x ⊆ y. As we
have already observed, S is reflexive, since ∀x ∈ B(x ⊆ x), but it is not true
that ∀x ∈ B ∀y ∈ B(x ⊆ y → y ⊆ x). For example, {1} ⊆ {1, 2}, but {1, 2}
⊈ {1}. You can see this in Figure 4.3 by noting that there is an edge from
{1} to {1, 2} but not from {1, 2} to {1}. Thus, S is not symmetric. S is
transitive, because the statement ∀x ∈ B ∀y ∈ B ∀z ∈ B((x ⊆ y ∧ y ⊆ z) →
x ⊆ z) is true.

For any set A the identity relation i_A will be reflexive, symmetric, and
transitive, because the statements ∀x ∈ A(x = x), ∀x ∈ A ∀y ∈ A(x = y → y
= x), and ∀x ∈ A ∀y ∈ A ∀z ∈ A((x = y ∧ y = z) → x = z) are all clearly
true. Finally, suppose r is a positive real number and consider the relation
D_r. For any real number x, |x − x| = 0 < r, so (x, x) ∈ D_r. Thus, D_r is
reflexive. Also, for any real numbers x and y, |x − y| = |y − x|, so if |x − y| < r
then |y − x| < r. Therefore, if (x, y) ∈ D_r then (y, x) ∈ D_r, so D_r is
symmetric. But D_r is not transitive. To see why, let x be any real number.
Let y = x + 2r/3 and z = y + 2r/3 = x + 4r/3. Then |x − y| = 2r/3 < r and |y − z| =
2r/3 < r, but |x − z| = 4r/3 > r. Thus, (x, y) ∈ D_r and (y, z) ∈ D_r but (x, z)
∉ D_r.
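
For relations on finite sets, the three properties in Definition 4.3.2 can be tested by exhaustive search. The Python sketch below checks the relation S and the identity relation from Example 4.3.1, and replays the counterexample showing that D_r is not transitive. The function names, and the choices r = 1 and x = 0, are assumptions of the sketch. Relations on all of ℝ, such as G, cannot be checked this way, which is one reason the general arguments in the solution are needed.

```python
from fractions import Fraction
from itertools import combinations

def powerset(a):
    """All subsets of a, as frozensets so they can appear inside pairs."""
    items = sorted(a)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

def is_reflexive(rel, a):
    return all((x, x) in rel for x in a)

def is_symmetric(rel):
    return all((y, x) in rel for (x, y) in rel)

def is_transitive(rel):
    return all((x, z) in rel
               for (x, y) in rel for (w, z) in rel if y == w)

# S: the subset relation on the power set of {1, 2}.
B = powerset({1, 2})
S = {(x, y) for x in B for y in B if x <= y}  # <= tests "is a subset of"
print(is_reflexive(S, B), is_symmetric(S), is_transitive(S))  # True False True

# The identity relation on A = {1, 2, 3}.
A = {1, 2, 3}
i_A = {(x, x) for x in A}
print(is_reflexive(i_A, A), is_symmetric(i_A), is_transitive(i_A))  # True True True

# The counterexample for D_r, with exact arithmetic to avoid rounding.
r, x = Fraction(1), Fraction(0)
y = x + 2 * r / 3
z = y + 2 * r / 3
print(abs(x - y) < r, abs(y - z) < r, abs(x - z) < r)  # True True False
```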


Perhaps you’ve already guessed that the properties of relations defined in
Definition 4.3.2 are related to the operations defined in Definition 4.2.3. To
say that a relation R is symmetric involves reversing the roles of two
variables in a way that may remind you of the definition of R⁻¹. The
definition of transitivity of a relation involves stringing together two
ordered pairs, just as the definition of composition of relations does. The
following theorem spells these connections out more carefully.


Theorem 4.3.4. Suppose R is a relation on a set A.


1. R is reflexive iff i_A ⊆ R, where as before i_A is the identity relation on A.
2. R is symmetric iff R = R⁻¹.
3. R is transitive iff R ∘ R ⊆ R.


Proof. We will prove 2 and leave the proofs of 1 and 3 as exercises (see 
exercises 7 and 8). 

2. (→) Suppose R is symmetric. Let (x, y) be an arbitrary element of R.
Then xRy, so since R is symmetric, yRx. Thus, (y, x) ∈ R, so by the definition
of R⁻¹, (x, y) ∈ R⁻¹. Since (x, y) was arbitrary, it follows that R ⊆ R⁻¹.

Now suppose (x, y) ∈ R⁻¹. Then (y, x) ∈ R, so since R is symmetric, (x,
y) ∈ R. Thus, R⁻¹ ⊆ R, so R = R⁻¹.

(←) Suppose R = R⁻¹, and let x and y be arbitrary elements of A. Suppose
xRy. Then (x, y) ∈ R, so since R = R⁻¹, (x, y) ∈ R⁻¹. By the definition of R⁻¹
this means (y, x) ∈ R, so yRx. Thus, ∀x ∈ A ∀y ∈ A(xRy → yRx), so R is
symmetric.

□


Commentary. This proof is fairly straightforward. The statement to be
proven is an iff statement, so we prove both directions separately. In the →
half we must prove that R = R⁻¹, and this is done by proving both R ⊆ R⁻¹
and R⁻¹ ⊆ R. Each of these goals is proven by taking an arbitrary element
of the first set and showing that it is in the second set. In the ← half we
must prove that R is symmetric, which means ∀x ∈ A ∀y ∈ A(xRy → yRx).
We use the obvious strategy of letting x and y be arbitrary elements of A,
assuming xRy, and proving yRx.
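
Because a three-element set has only 2⁹ = 512 relations on it, all three parts of Theorem 4.3.4 can be checked exhaustively for that one set. The Python sketch below does so; the set A = {0, 1, 2} and the function names are assumptions of the sketch, and the check is no substitute for the proofs, which cover every set A.

```python
from itertools import product

A = [0, 1, 2]
pairs = [(x, y) for x in A for y in A]
i_A = {(x, x) for x in A}

def inverse(rel):
    return {(b, a) for (a, b) in rel}

def compose(s, r):
    """S o R: pairs (a, c) with (a, b) in R and (b, c) in S for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def all_relations():
    """Yield every subset of A x A, that is, every relation on A."""
    for bits in product([False, True], repeat=len(pairs)):
        yield {p for p, keep in zip(pairs, bits) if keep}

checked = 0
for R in all_relations():
    reflexive = all((x, x) in R for x in A)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, z) in R
                     for (x, y) in R for (w, z) in R if y == w)
    assert reflexive == (i_A <= R)               # part 1
    assert symmetric == (R == inverse(R))        # part 2
    assert transitive == (compose(R, R) <= R)    # part 3
    checked += 1

print(checked)  # 512
```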


Exercises 


*1. Let L = {a, b, c, d, e} and W = {bad, bed, cab}. Let R = {(l, w) ∈ L ×
W | the letter l occurs in the word w}. Draw a diagram (like the one in
Figure 4.1) of R.


2. Let A = {cat, dog, bird, rat}, and let R = {(x, y) ∈ A × A | there is at
least one letter that occurs in both of the words x and y}. Draw a 
directed graph (like the one in Figure 4.3) for the relation R. Is R 
reflexive? Symmetric? Transitive? 


*3. Let A = {1, 2, 3, 4}. Draw a directed graph for i_A, the identity relation
on A.


4. List the ordered pairs in the relations represented by the directed
graphs in Figure 4.4. Determine whether each relation is reflexive,
symmetric, or transitive.


Figure 4.4. [Four directed graphs, labeled (a) through (d).]


*5. Figure 4.5 shows two relations R and S. Find S ° R. 


6. Suppose r and s are two positive real numbers. Let D_r and D_s be
defined as in part 3 of Example 4.3.1. What is D_r ∘ D_s? Justify your
answer with a proof. (Hint: In your proof, you may find it helpful to
use the triangle inequality; see exercise 13(c) of Section 3.5.)


*7. Prove part 1 of Theorem 4.3.4.
8. Prove part 3 of Theorem 4.3.4. 


9. Suppose A and B are sets. 
(a) Show that for every relation R from A to B, R ∘ i_A = R.
(b) Show that for every relation R from A to B, i_B ∘ R = R.


*10. Suppose S is a relation on A. Let D = Dom (S) and R = Ran (S). 
Prove that i_D ⊆ S⁻¹ ∘ S and i_R ⊆ S ∘ S⁻¹.


11. Suppose R is a relation on A. Prove that if R is reflexive then R ⊆ R ∘
R.


Figure 4.5. 


12. Suppose R is a relation on A. 


(a) Prove that if R is reflexive, then so is R⁻¹.
(b) Prove that if R is symmetric, then so is R⁻¹.
(c) Prove that if R is transitive, then so is R⁻¹.


*13. Suppose R₁ and R₂ are relations on A. For each part, give either a
proof or a counterexample to justify your answer.
(a) If R₁ and R₂ are reflexive, must R₁ ∪ R₂ be reflexive?
(b) If R₁ and R₂ are symmetric, must R₁ ∪ R₂ be symmetric?
(c) If R₁ and R₂ are transitive, must R₁ ∪ R₂ be transitive?


14. Suppose R₁ and R₂ are relations on A. For each part, give either a
proof or a counterexample to justify your answer.
(a) If R₁ and R₂ are reflexive, must R₁ ∩ R₂ be reflexive?
(b) If R₁ and R₂ are symmetric, must R₁ ∩ R₂ be symmetric?
(c) If R₁ and R₂ are transitive, must R₁ ∩ R₂ be transitive?


15. Suppose R₁ and R₂ are relations on A. For each part, give either a
proof or a counterexample to justify your answer.
(a) If R₁ and R₂ are reflexive, must R₁ \ R₂ be reflexive?
(b) If R₁ and R₂ are symmetric, must R₁ \ R₂ be symmetric?
(c) If R₁ and R₂ are transitive, must R₁ \ R₂ be transitive?


16. Suppose R and S are reflexive relations on A. Prove that R°S is 
reflexive. 


*17. Suppose R and S are symmetric relations on A. Prove that R ° S is 
symmetric iff R ° S = S ° R. 

18. Suppose R and S are transitive relations on A. Prove that if S ∘ R ⊆ R ∘
S then R ∘ S is transitive.


19. Consider the following putative theorem. 


Theorem? Suppose R is a relation on A, and define a relation S on
𝒫(A) as follows:


S = {(X, Y) ∈ 𝒫(A) × 𝒫(A) | ∃x ∈ X ∃y ∈ Y(xRy)}.
If R is transitive, then so is S.


(a) What’s wrong with the following proof of the theorem?
Proof. Suppose R is transitive. Suppose (X, Y) ∈ S and (Y, Z) ∈ S.
Then by the definition of S, xRy and yRz, where x ∈ X, y ∈ Y, and z ∈
Z. Since xRy, yRz, and R is transitive, xRz. But then since x ∈ X and z
∈ Z, it follows from the definition of S that (X, Z) ∈ S. Thus, S is
transitive.
□
(b) Is the theorem correct? Justify your answer with either a proof or a 
counterexample. 


*20. Suppose R is a relation on A. Let B = {X ∈ 𝒫(A) | X ≠ ∅}, and
define a relation S on B as follows:


S = {(X, Y) ∈ B × B | ∀x ∈ X ∀y ∈ Y(xRy)}.
Prove that if R is transitive, then so is S. Why did the empty set have
to be excluded from the set B to make this proof work?


21. Suppose R is a relation on A, and define a relation S on 𝒫(A) as
follows:


S = {(X, Y) ∈ 𝒫(A) × 𝒫(A) | ∀x ∈ X ∃y ∈ Y(xRy)}.
For each part, give either a proof or a counterexample to justify your
answer.


(a) If R is reflexive, must S be reflexive?
(b) If R is symmetric, must S be symmetric?
(c) If R is transitive, must S be transitive?


22. Consider the following putative theorem: 


Theorem? Suppose R is a relation on A. If R is symmetric and 
transitive, then R is reflexive. 


Is the following proof correct? If so, what proof strategies does it use? 
If not, can it be fixed? Is the theorem correct? 


Proof. Let x be an arbitrary element of A. Let y be any element of A 
such that xRy. Since R is symmetric, it follows that yRx. But then by 
transitivity, since xRy and yRx we can conclude that xRx. Since x was 
arbitrary, we have shown that ∀x ∈ A(xRx), so R is reflexive.
□


*23. This problem was suggested by Professor William Zwicker of Union 
College, NY. Suppose A is a set, and ℱ ⊆ 𝒫(A). Let R = {(a, b) ∈ A ×
A | for every X ⊆ A \ {a, b}, if X ∪ {a} ∈ ℱ then X ∪ {b} ∈ ℱ}. Show
that R is transitive.


24. Let R = {(m, n) ∈ ℕ × ℕ | |m − n| < 1}, which is a relation on ℕ. Note
that R ⊆ ℤ × ℤ, so R is also a relation on ℤ. This exercise will illustrate
why, in part 1 of Definition 4.3.2, we defined the phrase “R is reflexive
on A,” rather than simply “R is reflexive.”


(a) Is R reflexive on ℕ?
(b) Is R reflexive on ℤ?


4.4 Ordering Relations 


Consider the relation L = {(x, y) ∈ ℝ × ℝ | x ≤ y}. You should be able to
check for yourself that it is reflexive and transitive, but not symmetric. It
fails to be symmetric in a rather extreme way because there are many pairs
(x, y) such that xLy is true but yLx is false. In fact, the only way xLy and yLx
can both be true is if x ≤ y and y ≤ x, and thus x = y. We therefore say that L
is antisymmetric. Here is the general definition.


Definition 4.4.1. Suppose R is a relation on a set A. Then R is said to be 
antisymmetric if ∀x ∈ A ∀y ∈ A((xRy ∧ yRx) → x = y).


We have already seen a relation with many of the same properties as L.
Look again at the relation S defined in part 1 of Example 4.3.1. Recall that
in that example we let A = {1, 2}, B = 𝒫(A), and S = {(x, y) ∈ B × B | x ⊆
y}. Thus, if x and y are elements of B, then xSy means x ⊆ y. We checked in
the last section that S is reflexive and transitive, but not symmetric. In fact,
S is also antisymmetric, because for any sets x and y, if x ⊆ y and y ⊆ x
then x = y. You may find it useful to look back at Figure 4.3 in the last
section, which shows the directed graph representing S.

Intuitively, L and S are both relations that have something to do with 
comparing the sizes of two objects. Each of the statements x ≤ y and x ⊆ y
can be thought of as saying that, in some sense, y is “at least as large as” x. 
You might say that each of these statements specifies what order x and y 
come in. This motivates the following definition. 


Definition 4.4.2. Suppose R is a relation on a set A. Then R is called a
partial order on A (or just a partial order if A is clear from context) if it is
reflexive, transitive, and antisymmetric. It is called a total order on A (or
just a total order) if it is a partial order, and in addition it has the following
property:
∀x ∈ A ∀y ∈ A(xRy ∨ yRx).


The relations L and S just considered are both partial orders. S is not a
total order, because it is not true that ∀x ∈ B ∀y ∈ B(x ⊆ y ∨ y ⊆ x). For
example, if we let x = {1} and y = {2}, then x ⊈ y and y ⊈ x. Thus,
although we can think of the relation S as indicating a sense in which one
element of B might be at least as large as another, it does not give us a way
of comparing every pair of elements of B. For some pairs, such as {1} and
{2}, S doesn’t pick out either one as being at least as large as the other. This
is the sense in which the ordering is partial. On the other hand, L is a total
order, because if x and y are any two real numbers, then either x ≤ y or y ≤ x.
Thus, L does give us a way of comparing any two real numbers.


Example 4.4.3. Which of the following relations are partial orders? Which 
are total orders? 


1. Let A be any set, and let B = 𝒫(A) and S = {(x, y) ∈ B × B | x ⊆ y}.
2. Let A = {1, 2} and B = 𝒫(A) as before. Let
R = {(x, y) ∈ B × B | y has at least as many elements as x}
  = {(∅, ∅), (∅, {1}), (∅, {2}), (∅, {1, 2}), ({1}, {1}), ({1}, {2}),
     ({1}, {1, 2}), ({2}, {1}), ({2}, {2}), ({2}, {1, 2}), ({1, 2}, {1, 2})}.
3. D = {(x, y) ∈ ℤ⁺ × ℤ⁺ | x divides y}.
4. G = {(x, y) ∈ ℝ × ℝ | x ≥ y}.


Solutions 


1. This is just a generalization of one of the examples discussed earlier,
and it is easy to check that it is a partial order. As long as A has at
least two elements, it will not be a total order. To see why, just note
that if a and b are distinct elements of A, then {a} and {b} are
elements of B for which {a} ⊈ {b} and {b} ⊈ {a}.


2. Note that ({1}, {2}) ∈ R and ({2}, {1}) ∈ R, but of course {1} ≠
{2}. Thus, R is not antisymmetric, so it is not a partial order.
Although R was defined by picking out pairs (x, y) in which y is, in a
certain sense, at least as large as x, it does not satisfy the definition of
partial order. This example shows that our description of partial
orders as relations that indicate a sense in which one object is at least
as large as another should not be taken too seriously. This was the
motivation for the definition of partial order, but it is not the definition
itself.


3. Clearly every positive integer is divisible by itself, so D is reflexive.
Also, as we showed in Theorem 3.3.7, if x | y and y | z then x | z. Thus,
if (x, y) ∈ D and (y, z) ∈ D then (x, z) ∈ D, so D is transitive. Finally,
suppose (x, y) ∈ D and (y, x) ∈ D. Then x | y and y | x, and because x
and y are positive it follows that x ≤ y and y ≤ x, so x = y. Thus, D is
antisymmetric, so it is a partial order. It is easy to find examples
illustrating that D is not a total order. For example, (3, 5) ∉ D and (5,
3) ∉ D.

Perhaps you were surprised to discover that D is a partial order. It 
doesn’t seem to involve comparing the sizes of things, like the other 
partial orders we’ve seen. But we have shown that it does share with 
these other relations the important properties of reflexivity, 
transitivity, and antisymmetry. In fact, this is one of the reasons for 
formulating definitions such as Definition 4.4.2. They help us to see 
similarities between things that, on the surface, might not seem 
similar at all. 


4. You should be able to check for yourself that G is a total order. Notice
that in this case it seems more reasonable to think of xGy as meaning
that y is at least as small as x rather than at least as large. The
definition of partial order, though motivated by thinking about
orderings that go in one direction, actually applies to orderings in
either direction. In fact, this example might lead you to conjecture that
if R is a partial order on A, then so is R⁻¹. You are asked to prove this
conjecture in exercise 13.
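
The four solutions above can be spot-checked by computer on finite cases. In the Python sketch below, part 1 is tested only for A = {1, 2}, the divisibility relation is restricted to the integers 1 through 12, and ≥ is restricted to a small sample of numbers; these restrictions and all function names are assumptions of the sketch, so the output illustrates the solutions rather than proving them.

```python
from itertools import combinations

def is_reflexive(rel, a):
    return all((x, x) in rel for x in a)

def is_transitive(rel):
    return all((x, z) in rel
               for (x, y) in rel for (w, z) in rel if y == w)

def is_antisymmetric(rel):
    return all(x == y for (x, y) in rel if (y, x) in rel)

def is_partial_order(rel, a):
    return (is_reflexive(rel, a) and is_transitive(rel)
            and is_antisymmetric(rel))

def is_total_order(rel, a):
    comparable = all((x, y) in rel or (y, x) in rel for x in a for y in a)
    return is_partial_order(rel, a) and comparable

def powerset(a):
    items = sorted(a)
    return [frozenset(c) for n in range(len(items) + 1)
            for c in combinations(items, n)]

B = powerset({1, 2})
S = {(x, y) for x in B for y in B if x <= y}            # part 1, A = {1, 2}
R = {(x, y) for x in B for y in B if len(x) <= len(y)}  # part 2
N = range(1, 13)
D = {(x, y) for x in N for y in N if y % x == 0}        # part 3, restricted
sample = [-2, 0, 1, 3.5]
G = {(x, y) for x in sample for y in sample if x >= y}  # part 4, restricted

print(is_partial_order(S, B), is_total_order(S, B))  # True False
print(is_antisymmetric(R))                           # False
print(is_partial_order(D, N), is_total_order(D, N))  # True False
print(is_total_order(G, sample))                     # True
```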


So far we have always used letters as the names for our relations, but
sometimes mathematicians represent relations with symbols rather than
letters. For example, in part 4 of Example 4.4.3 we used the letter G as the
name for a relation. But in that example, for all real numbers x and y, xGy
meant the same thing as x ≥ y. This suggests that we didn’t really need to
introduce the letter G; we could simply have treated the symbol ≥ as the
name for the relation. Using this notation, we could say that ≥ is a total
order on ℝ.


Here’s another example of a partial order. Let A be the set of all words in 
English, and let R = {(x, y) ∈ A × A | all the letters in the word x appear,
consecutively and in the right order, in the word y}. For example, (can, 
cannot), (tar, start), and (ball, ball) are all elements of R, but (can, anchor) 
and (can, carnival) are not. You should be able to check that R is reflexive, 
transitive, and antisymmetric, so R is a partial order. Now consider the set B 
= {me, men, tame, mental} ⊆ A. Clearly many ordered pairs of words in B
are in the relation R, but note in particular that the ordered pairs (me, me), 
(me, men), (me, tame), and (me, mental) are all in R. If we think of xRy as 
meaning that y is in some sense at least as large as x, then we could say that 
the word me is the smallest element of B, in the sense that it is smaller than 
everything else in the set. 

Not every set of words will have an element that is smallest in this sense. 
For example, consider the set C = {a, me, men, tame, mental} ⊆ A. Each of
the words men, tame, and mental is larger than at least one other word in the 
set, but neither a nor me is larger than anything else in the set. We’ll call a 
and me minimal elements of C. But note that neither a nor me is the 
smallest element of C in the sense described in the last paragraph, because 
neither is smaller than the other. The set C has two minimal elements but no 
smallest element. 

These examples might raise a number of questions in your mind about 
smallest and minimal elements. The set C has two minimal elements, but B 
has only one smallest element. Can a set ever have more than one smallest 
element? Until we have settled this question, we should only talk about an 
object being a smallest element of a set, rather than the smallest element. If 
a set has only one minimal element, must it be a smallest element? Can a 
set have a smallest element and a minimal element that are different? Would 
the answers to these questions be different if we restricted our attention to
total orders rather than all partial orders? Before we try to answer any of
these questions, we should state the definitions of the terms smallest and 
minimal more carefully. 


Definition 4.4.4. Suppose R is a partial order on a set A, B ⊆ A, and b ∈ B.
Then b is called an R-smallest element of B (or just a smallest element if R
is clear from the context) if ∀x ∈ B(bRx). It is called an R-minimal element
(or just a minimal element) if ¬∃x ∈ B(xRb ∧ x ≠ b).
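For a finite set, the two conditions in Definition 4.4.4 can be checked mechanically. The following Python sketch is an illustration only and is not part of the original text: the function names smallest_elements and minimal_elements, and the choice to represent the relation R as a two-argument predicate, are assumptions made for the example. It is applied to the word-containment order and the sets B and C discussed earlier in this section.

```python
def smallest_elements(B, R):
    """Return every b in B such that b R x holds for all x in B."""
    return {b for b in B if all(R(b, x) for x in B)}


def minimal_elements(B, R):
    """Return every b in B for which no x in B other than b satisfies x R b."""
    return {b for b in B if not any(R(x, b) and x != b for x in B)}


def contained_in(x, y):
    """x R y: the letters of x appear consecutively and in order in y."""
    return x in y


B = {"me", "men", "tame", "mental"}
C = {"a", "me", "men", "tame", "mental"}

print(smallest_elements(B, contained_in))         # the single word "me"
print(smallest_elements(C, contained_in))         # empty: C has no smallest element
print(sorted(minimal_elements(C, contained_in)))  # "a" and "me"
```

Both functions return sets rather than single elements, since at this point in the discussion it has not yet been settled whether a smallest element must be unique.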


Example 4.4.5. 


1. Let L = {(x, y) ∈ ℝ × ℝ | x ≤ y}, as before. Let B = {x ∈ ℝ | x ≥ 7}.
Does B have any L-smallest or L-minimal elements? What about the
set C = {x ∈ ℝ | x > 7}? As mentioned earlier, we could do without
the letter L here and ask for ≤-smallest or ≤-minimal elements of B
and C.

2. Let D be the divisibility relation defined in part 3 of Example 4.4.3.
Let B = {3, 4, 5, 6, 7, 8, 9}. Does B have any D-smallest or D-minimal
elements?

3. Let S = {(X, Y) ∈ 𝒫(ℕ) × 𝒫(ℕ) | X ⊆ Y}, which is a partial order on
the set 𝒫(ℕ). Let ℱ = {X ∈ 𝒫(ℕ) | 2 ∈ X and 3 ∈ X}. Note that the
elements of ℱ are not natural numbers, but sets of natural numbers.
For example, {1, 2, 3} and {n ∈ ℕ | n is prime} are both elements of
ℱ. Does ℱ have any S-smallest or S-minimal elements? What about
the set 𝒢 = {X ∈ 𝒫(ℕ) | either 2 ∈ X or 3 ∈ X}?


Solutions 


1. Clearly 7 ≤ x for every x ∈ B, so ∀x ∈ B(7Lx) and therefore 7 is a
smallest element of B. It is also a minimal element, since nothing in B
is smaller than 7, so ¬∃x ∈ B(xL7 ∧ x ≠ 7). There are no other
smallest or minimal elements. Note that 7 is not a smallest or minimal
element of C, since 7 ∉ C. According to Definition 4.4.4, a smallest
or minimal element of a set must actually be an element of the set. In
fact, C has no smallest or minimal elements.

2. First of all, note that 6 and 9 are not minimal because both are
divisible by 3, and 8 is not minimal because it is divisible by 4. All
the other elements of B are minimal elements, but none is a smallest
element.

3. The set {2, 3} is a smallest element of ℱ, since 2 and 3 are elements
of every set in ℱ and therefore ∀X ∈ ℱ({2, 3} ⊆ X). It is also a
minimal element, since no other element of ℱ is a subset of it, and
there are no other smallest or minimal elements. The set 𝒢 has two
minimal elements, {2} and {3}. Every other set in 𝒢 must contain one
of these two as a subset, so no other set can be minimal. Neither set is
smallest, since neither is a subset of the other.
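Parts 2 and 3 of these solutions can be spot-checked by computer. The sketch below is an illustration under stated assumptions, not the book's method: the helper names are made up for the example, and for part 3 the infinite set of all sets of natural numbers is replaced by the subsets of the small universe {1, 2, 3, 4}, so it only examines a finite analogue of the families in the example.

```python
from itertools import combinations


def smallest_elements(B, R):
    """Return every b in B such that b R x holds for all x in B."""
    return {b for b in B if all(R(b, x) for x in B)}


def minimal_elements(B, R):
    """Return every b in B for which no x in B other than b satisfies x R b."""
    return {b for b in B if not any(R(x, b) and x != b for x in B)}


def divides(x, y):
    """x D y: x divides y (for positive integers)."""
    return y % x == 0


def is_subset(X, Y):
    """X S Y: X is a subset of Y."""
    return X <= Y


def subsets(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Part 2: B = {3, ..., 9} under divisibility.
B = set(range(3, 10))
print(sorted(minimal_elements(B, divides)))  # 3, 4, 5 and 7
print(smallest_elements(B, divides))         # empty

# Part 3, restricted to subsets of the assumed finite universe {1, 2, 3, 4}.
universe = {1, 2, 3, 4}
F = [X for X in subsets(universe) if 2 in X and 3 in X]
G = [X for X in subsets(universe) if 2 in X or 3 in X]
print(smallest_elements(F, is_subset))  # only {2, 3}
print(minimal_elements(G, is_subset))   # {2} and {3}
print(smallest_elements(G, is_subset))  # empty
```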


We are now ready to answer some of the questions we raised before 
Definition 4.4.4. 


Theorem 4.4.6. Suppose R is a partial order on a set A, and B ⊆ A.

1. If B has a smallest element, then this smallest element is unique. Thus,
we can speak of the smallest element of B rather than a smallest
element.

2. Suppose b is the smallest element of B. Then b is also a minimal
element of B, and it is the only minimal element.

3. If R is a total order and b is a minimal element of B, then b is the
smallest element of B.


Scratch work 


These proofs are somewhat harder than earlier ones in this chapter, so we 
do some scratch work before the proofs. 


1. Of course, we start by assuming that B has a smallest element, and
because this is an existential statement, we immediately introduce a
name, say b, for a smallest element of B. We must prove that b is the
only smallest element. As we saw in Section 3.6, this can be written
∀c(c is a smallest element of B → b = c), so our next step should be
to let c be arbitrary, assume it is also a smallest element, and prove b
= c.


At this point, we don’t know much about b and c. We know they’re both 
elements of B, but we don’t even know what kinds of objects are in B — 
whether they’re numbers, or sets, or some other type of object — so this 
doesn’t help us much in deciding how to prove that b = c. The only other 
fact we know about b and c is that they are both smallest elements of B, 
which means ∀x ∈ B(bRx) and ∀x ∈ B(cRx). The most promising way to
use these statements is to plug something in for x in each statement. What 
we plug in should be an element of B, and we only know of two elements of 
B at this point, b and c. Plugging in both of them in both statements, we get 
bRb, bRc, cRb, and cRc. Of course, we already knew bRb and cRc, since R 
is reflexive. But when you see that bRc and cRb, you should think of 
antisymmetry. Since R is a partial order, it is antisymmetric, so from bRc 
and cRb it follows that b = c. 


2. Our first goal is to prove that b is a minimal element of B, which
means ¬∃x ∈ B(xRb ∧ x ≠ b). Because this is a negative statement, it
might help to reexpress it as an equivalent positive statement:

¬∃x ∈ B(xRb ∧ x ≠ b) iff ∀x ∈ B ¬(xRb ∧ x ≠ b)
                     iff ∀x ∈ B(¬xRb ∨ x = b)
                     iff ∀x ∈ B(xRb → x = b).


Thus, to prove that b is minimal we could let x be an arbitrary element of B, 
assume that xRb, and prove x = b. 

Once again, it’s a good idea to take stock of what we know at this point 
about b and x. We know xRb, and we know that b is the smallest element of 
B, which means ∀x ∈ B(bRx). If we apply this last fact to our arbitrary x,
then as in part 1 we can use antisymmetry to complete the proof. 

We still must prove that b is the only minimal element, and as in part 1
this means ∀c(c is a minimal element of B → b = c). So we let c be
arbitrary and assume that c is a minimal element of B, and we must prove
that b = c. The assumption that c is a minimal element of B means that c ∈
B and ¬∃x ∈ B(xRc ∧ x ≠ c), but as before, we can reexpress this last
statement in the equivalent positive form ∀x ∈ B(xRc → x = c). To use this
statement we should plug in something for x, and because our goal is to
show that b = c, plugging in b for x seems like a good idea. This gives us
bRc → b = c, so if only we could show bRc, we could complete the proof
by using modus ponens to conclude that b = c. But we know b is the
smallest element of B, so of course bRc is true.


3. Of course, we start by assuming that R is a total order and b is a 
minimal element of B. We must prove that b is the smallest element of 
B, which means ∀x ∈ B(bRx), so we let x be an arbitrary element of B
and try to prove bRx. 


We know from examples we’ve looked at that minimal elements in 
partial orders are not always smallest elements, so the assumption 
that R is a total order must be crucial. The assumption that R is total 
means ∀x ∈ A∀y ∈ A(xRy ∨ yRx), so to use it we should plug in
something for x and y. The only likely candidates for what to plug in
are b and our arbitrary object x, and plugging these in we get xRb ∨
bRx. Our goal is bRx, so this certainly looks like progress. If only we
could rule out the possibility that xRb, we’d be done. So let’s see if
we can prove ¬xRb.


Because this is a negative statement, we try proof by contradiction. 
Suppose xRb. What given statement can we contradict? The only 
given we haven’t used yet is the fact that b is minimal, and since this 
is a negative statement, it is the natural place to look for a 
contradiction. To contradict the fact that b is minimal, we should try 
to show that ∃x ∈ B(xRb ∧ x ≠ b). But we’ve already assumed xRb,
so if we could show x ≠ b we’d be done.


You should try proving x ≠ b at this point. You won’t get anywhere.
The fact is, we started out by letting x be an arbitrary element of B,
and this means that it could be any element of B, including b. We then
assumed that xRb, but since R is reflexive, this still doesn’t rule out
the possibility that x = b. There really isn’t any hope of proving x ≠ b.
We seem to be stuck. 

Let’s review our overall plan for the proof. We needed to show ∀x
∈ B(bRx), so we let x be an arbitrary element of B, and we’re trying
to show bRx. We’ve now run into problems because of the possibility
that x = b. But if our ultimate goal is to prove bRx, then the possibility
that x = b really isn’t a problem after all. Since R is reflexive, if x = b
then of course bRx will be true!


Now, how should we structure the final write-up of the proof? It 
appears that our reasoning to establish bRx will have to be different 
depending on whether or not x = b. This suggests proof by cases. In 
case 1 we assume that x = b, and use the fact that R is reflexive to 
complete the proof. In case 2 we assume that x ≠ b, and then we can
use our original line of attack, starting with the fact that R is total. 


Proof. 


1. Suppose b is a smallest element of B, and suppose c is also a smallest 
element of B. Since b is a smallest element, ∀x ∈ B(bRx), so in
particular bRc. Similarly, since c is a smallest element, cRb. But now
since R is a partial order, it must be antisymmetric, so from bRc and 
cRb we can conclude b = c. 


2. Let x be an arbitrary element of B and suppose that xRb. Since b is the 
smallest element of B, we must have bRx, and now by antisymmetry it 
follows that x = b. Thus, there can be no x ∈ B such that xRb and x ≠
b, so b is a minimal element.


To see that it is the only one, suppose c is also a minimal element. 
Since b is the smallest element of B, bRc. But then since c is minimal 
we must have b = c. Thus b is the only minimal element of B. 


3. Suppose R is a total order and b is a minimal element of B. Let x be an 
arbitrary element of B. If x = b, then since R is reflexive, bRx. Now 
suppose x ≠ b. Since R is a total order, we know that either xRb or
bRx. But xRb can’t be true, since by combining xRb with our
assumption that x ≠ b we could conclude that b is not minimal,
thereby contradicting our assumption that it is minimal. Thus, bRx
must be true. Since x was arbitrary, we can conclude that ∀x ∈
B(bRx), so b is the smallest element of B.

□
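A proof settles the theorem for every partial order, but it can still be reassuring to test the three statements exhaustively on a small case. The sketch below is such a sanity check, not a substitute for the proof: it enumerates every relation on the three-element set A = {0, 1, 2}, keeps the partial orders, and checks parts 1 to 3 of Theorem 4.4.6 for every subset B of A. The set A and all function names are assumptions chosen for the illustration.

```python
from itertools import combinations, product

A = (0, 1, 2)
PAIRS = [(x, y) for x in A for y in A]


def is_partial_order(R):
    """Reflexive, antisymmetric and transitive on A."""
    reflexive = all((x, x) in R for x in A)
    antisymmetric = all(x == y for (x, y) in R if (y, x) in R)
    transitive = all((x, z) in R
                     for (x, y) in R for (w, z) in R if y == w)
    return reflexive and antisymmetric and transitive


def is_total(R):
    """Every two elements of A are comparable under R."""
    return all((x, y) in R or (y, x) in R for x in A for y in A)


def smallest_elements(B, R):
    return {b for b in B if all((b, x) in R for x in B)}


def minimal_elements(B, R):
    return {b for b in B if not any((x, b) in R and x != b for x in B)}


def theorem_holds(R):
    """Check parts 1-3 of the theorem for every subset B of A."""
    for r in range(len(A) + 1):
        for B in map(set, combinations(A, r)):
            smallest = smallest_elements(B, R)
            minimal = minimal_elements(B, R)
            if len(smallest) > 1:                    # part 1: uniqueness
                return False
            if smallest and minimal != smallest:     # part 2: only minimal element
                return False
            if is_total(R) and minimal != smallest:  # part 3: total orders
                return False
    return True


# Every relation on A is a subset of A x A; keep only the partial orders.
relations = [{p for p, keep in zip(PAIRS, bits) if keep}
             for bits in product([False, True], repeat=len(PAIRS))]
partial_orders = [R for R in relations if is_partial_order(R)]

print(len(partial_orders), all(theorem_holds(R) for R in partial_orders))
```

Passing this check only shows that no counterexample exists on a three-element set; the general statement still rests on the proof above.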


When comparing subsets of some set A, mathematicians often use the
partial order S = {(X, Y) ∈ 𝒫(A) × 𝒫(A) | X ⊆ Y}, although this is not
always made explicit. Recall that if ℱ ⊆ 𝒫(A) and X ∈ ℱ, then according
to Definition 4.4.4, X is the S-smallest element of ℱ iff ∀Y ∈ ℱ(X ⊆ Y). In
other words, to say that an element of ℱ is the smallest element means that
it is a subset of every element of ℱ. Similarly, mathematicians sometimes
talk of a set being the smallest one with a certain property. Generally this
means that the set has the property in question, and furthermore it is a
subset of every set that has the property. For example, we might describe
our conclusion in part 3 of Example 4.4.5 by saying that {2, 3} is the
smallest set X ⊆ ℕ with the property that 2 ∈ X and 3 ∈ X. We will see
more examples of this idea in later chapters.


Example 4.4.7. 
1. Find the smallest set of real numbers X such that 5 ∈ X and for all
real numbers x and y, if x ∈ X and x < y then y ∈ X.

2. Find the smallest set of real numbers X such that X ≠ ∅ and for all
real numbers x and y, if x ∈ X and x < y then y ∈ X.

Solutions

1. Another way to phrase the question would be to say that we are
looking for the smallest element of the family of sets ℱ = {X ⊆ ℝ | 5
∈ X and ∀x∀y((x ∈ X ∧ x < y) → y ∈ X)}, where it is understood
that smallest means smallest with respect to the subset partial order.
Now for any set X ∈ ℱ we know that 5 ∈ X, and we know that
∀x∀y((x ∈ X ∧ x < y) → y ∈ X). In particular, since 5 ∈ X we can
say that ∀y(5 < y → y ∈ X). Thus, if we let A = {y ∈ ℝ | 5 ≤ y}, then
we can conclude that ∀X ∈ ℱ(A ⊆ X). But it is easy to see that A ∈ ℱ,
so A is the smallest element of ℱ.

2. We must find the smallest element of the family of sets ℱ = {X ⊆ ℝ |
X ≠ ∅ and ∀x∀y((x ∈ X ∧ x < y) → y ∈ X)}. The set A = {y ∈ ℝ | 5
≤ y} from part 1 is an element of ℱ, but it is not the smallest element,
or even a minimal element, because the set A′ = {y ∈ ℝ | 6 ≤ y} is
smaller; in other words, A′ ⊆ A and A′ ≠ A. But A′ is also not the
smallest element, since A″ = {y ∈ ℝ | 7 ≤ y} is still smaller. In fact,
this family has no smallest, or even minimal, element. You’re asked to
verify this in exercise 12. This example shows that we must be careful
when talking about the smallest set with some property. There may be
no such smallest set!


You have probably already guessed how to define maximal and largest 
elements in partially ordered sets. Suppose R is a partial order on A, B ⊆ A,
and b ∈ B. We say that b is the largest element of B if ∀x ∈ B(xRb), and it
is a maximal element of B if ¬∃x ∈ B(bRx ∧ b ≠ x). Of course, these
definitions are quite similar to the ones in Definition 4.4.4. You are asked in 
exercise 14 to work out some of the connections among these ideas. 
Another useful related idea is the concept of an upper or lower bound for a 
set. 


Definition 4.4.8. Suppose R is a partial order on A, B ⊆ A, and a ∈ A. Then
a is called a lower bound for B if ∀x ∈ B(aRx). Similarly, it is an upper
bound for B if ∀x ∈ B(xRa).


Note that a lower bound for B need not be an element of B. This is the 
only difference between lower bounds and smallest elements. A smallest 
element of B is just a lower bound that is also an element of B. For 
example, in part 1 of Example 4.4.5, we concluded that 7 was not a smallest
element of the set C = {x ∈ ℝ | x > 7} because 7 ∉ C. But 7 is a lower
bound for C. In fact, so is every real number smaller than 7, but not any
number larger than 7. Thus, the set of all lower bounds of C is the set {x ∈
ℝ | x ≤ 7}, and 7 is its largest element. We say that 7 is the greatest lower
bound of the set C.


Definition 4.4.9. Suppose R is a partial order on A and B ⊆ A. Let U be the
set of all upper bounds for B, and let L be the set of all lower bounds. If U
has a smallest element, then this smallest element is called the least upper
bound of B. If L has a largest element, then this largest element is called the
greatest lower bound of B. The phrases least upper bound and greatest lower
bound are sometimes abbreviated l.u.b. and g.l.b.
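When A is finite, Definitions 4.4.8 and 4.4.9 translate directly into a computation: collect the bounds, then look for a smallest or largest element among them. The following Python sketch is only an illustration. The function names are hypothetical, and the two example posets (divisibility on {1, ..., 12} and on {2, 3, 12, 18}) are chosen for the illustration and do not come from the text.

```python
def upper_bounds(A, B, R):
    """All a in A such that x R a for every x in B."""
    return {a for a in A if all(R(x, a) for x in B)}


def lower_bounds(A, B, R):
    """All a in A such that a R x for every x in B."""
    return {a for a in A if all(R(a, x) for x in B)}


def smallest_element(S, R):
    """The R-smallest element of S, or None if S has none."""
    found = [s for s in S if all(R(s, x) for x in S)]
    return found[0] if found else None


def largest_element(S, R):
    """The R-largest element of S, or None if S has none."""
    found = [s for s in S if all(R(x, s) for x in S)]
    return found[0] if found else None


def least_upper_bound(A, B, R):
    return smallest_element(upper_bounds(A, B, R), R)


def greatest_lower_bound(A, B, R):
    return largest_element(lower_bounds(A, B, R), R)


def divides(x, y):
    """x R y: x divides y (for positive integers)."""
    return y % x == 0


A = set(range(1, 13))
print(least_upper_bound(A, {4, 6}, divides))     # 12
print(greatest_lower_bound(A, {4, 6}, divides))  # 2

# In a smaller poset the same kind of set may have upper bounds but no l.u.b.
A2 = {2, 3, 12, 18}
print(upper_bounds(A2, {2, 3}, divides))          # 12 and 18, neither divides the other
print(least_upper_bound(A2, {2, 3}, divides))     # None
print(greatest_lower_bound(A2, {2, 3}, divides))  # None: there are no lower bounds
```

Returning None when no bound of the required kind exists mirrors the wording of Definition 4.4.9, which defines the l.u.b. and g.l.b. only when the relevant smallest or largest element exists.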


Example 4.4.10. 


1. Let L = {(x, y) ∈ ℝ × ℝ | x ≤ y}, a total order on ℝ. Let B = {1/n | n ∈
ℤ⁺} = {1, 1/2, 1/3, 1/4, 1/5, ...} ⊆ ℝ. Does B have any upper or lower
bounds? Does it have a least upper bound or greatest lower bound?


2. Let A be the set of all English words, and let R be the partial order on 
A described after Example 4.4.3. Let B = {hold, up}. Does B have any 
upper or lower bounds? Does it have a least upper bound or a greatest 
lower bound? 


Solutions 


1. Clearly the largest element of B is 1. It is also an upper bound for B,
as is any number larger than 1. By definition, an upper bound for B
must be at least as large as every element of B, so in particular it must
be at least as large as 1. Thus, no number smaller than 1 is an upper
bound for B, so the set of upper bounds for B is {x ∈ ℝ | x ≥ 1}.
Clearly the smallest element of this set is 1, so 1 is the l.u.b. of B.

Clearly 0 is a lower bound for B, as is any negative number. On the
other hand, suppose a is a positive number. Then for a large enough
integer n we will have 1/n < a. (You should convince yourself that
any integer n larger than 1/a would do.) Thus, it is not the case that
∀x ∈ B(a ≤ x), and therefore a is not a lower bound for B. So the set
of all lower bounds for B is {x ∈ ℝ | x ≤ 0}, and the g.l.b. of B is 0.


2. Clearly holdup and uphold are upper bounds for B. In fact, no shorter 
word could be an upper bound, so they are both minimal elements of 
the set of all upper bounds. According to part 2 of Theorem 4.4.6, a 
set that has more than one minimal element can have no smallest 
element, so the set of all upper bounds for B does not have a smallest 
element, and therefore B doesn’t have a least upper bound. 


The words hold and up have no letters in common, so B has no 
lower bounds. 
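The reasoning in part 2 can be replayed on a small word list. In the sketch below, the set A is a hypothetical stand-in for the set of all English words (a real check would need a full dictionary), and the variable names are assumptions made for the illustration.

```python
def contained_in(x, y):
    """x R y: the letters of x appear consecutively and in order in y."""
    return x in y


# Hypothetical miniature dictionary standing in for the set of all English words.
A = {"hold", "up", "holdup", "uphold", "holdups", "upholds", "old", "cup"}
B = {"hold", "up"}

upper = {a for a in A if all(contained_in(x, a) for x in B)}
lower = {a for a in A if all(contained_in(a, x) for x in B)}

# Minimal and smallest elements of the set of upper bounds.
minimal_upper = {u for u in upper
                 if not any(contained_in(v, u) and v != u for v in upper)}
smallest_upper = {u for u in upper if all(contained_in(u, v) for v in upper)}

print(sorted(upper))          # holdup, holdups, uphold, upholds
print(sorted(minimal_upper))  # holdup, uphold
print(smallest_upper)         # empty, so B has no least upper bound
print(lower)                  # empty, so B has no lower bounds
```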


Notice that in part 1 of Example 4.4.10, the largest element of B also 
turned out to be its least upper bound. You might wonder whether largest 
elements are always least upper bounds and whether smallest elements are
always greatest lower bounds. You are asked to prove that they are in
exercise 20. Another interesting fact about this example is that, although B 
did not have a smallest element, it did have a greatest lower bound. This 
was not a coincidence. It is an important fact about the real numbers that 
every nonempty set of real numbers that has a lower bound has a greatest 
lower bound and, similarly, every nonempty set of real numbers that has an 
upper bound has a least upper bound. The proof of this fact is beyond the 
scope of this book, but it is important to realize that it is a special fact about 
the real numbers; it does not apply to all partial orders or even to all total 
orders. For example, the set B in the second part of Example 4.4.10 had 
upper bounds but no least upper bound. 

We end this section by looking once again at how these new concepts 
apply to the subset partial order on 𝒫(A), for any set A. It turns out that in
this partial order, least upper bounds and greatest lower bounds are our old 
friends unions and intersections. 


Theorem 4.4.11. Suppose A is a set, ℱ ⊆ 𝒫(A), and ℱ ≠ ∅. Then the least
upper bound of ℱ (in the subset partial order) is ⋃ℱ and the greatest lower
bound of ℱ is ⋂ℱ.


Proof. See exercise 23. 
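Before attempting the proof, it may help to see the statement of Theorem 4.4.11 confirmed on one small instance. The sketch below is a sanity check on a single example, not a proof: the set A = {1, 2, 3, 4} and the family F = {{1, 2}, {2, 3}} are assumptions chosen for the illustration, and the bounds are found by brute force over all subsets of A.

```python
from itertools import combinations


def powerset(A):
    """All subsets of a finite set A, as frozensets."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


A = {1, 2, 3, 4}
F = [frozenset({1, 2}), frozenset({2, 3})]

P = powerset(A)
upper = [U for U in P if all(X <= U for X in F)]  # upper bounds of F
lower = [W for W in P if all(W <= X for X in F)]  # lower bounds of F

# Smallest upper bound and largest lower bound in the subset order.
lub = [U for U in upper if all(U <= V for V in upper)]
glb = [W for W in lower if all(V <= W for V in lower)]

union = frozenset().union(*F)
intersection = frozenset.intersection(*F)

print(lub == [union], glb == [intersection])
```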


Exercises 


*1. In each case, say whether or not R is a partial order on A. If so, is it a
total order? 


(a) A = {a, b, c}, R = {(a, a), (b, a), (b, b), (b, c), (c, c)}.

(b) A = ℝ, R = {(x, y) ∈ ℝ × ℝ | |x| ≤ |y|}.

(c) A = ℝ, R = {(x, y) ∈ ℝ × ℝ | |x| < |y| or x = y}.

2. In each case, say whether or not R is a partial order on A. If so, is it a 
total order? 


(a) A = the set of all words of English, R = {(x, y) ∈ A × A | the word y
occurs at least as late in alphabetical order as the word x}.

(b) A = the set of all words of English, R = {(x, y) ∈ A × A | the first letter
of the word y occurs at least as late in the alphabet as the first letter
of the word x}.

(c) A = the set of all countries in the world, R = {(x, y) ∈ A × A | the
population of the country y is at least as large as the population of the
country x}.


3. In each case find all minimal and maximal elements of B. Also find, if 
they exist, the largest and smallest elements of B, and the least upper 
bound and greatest lower bound of B. 


(a) R = the relation shown in the directed graph in Figure 4.6, B = {2, 3,
4}.

Figure 4.6.

(b) R = {(x, y) ∈ ℝ × ℝ | x ≤ y}, B = {x ∈ ℝ | 1 ≤ x < 2}.

(c) R = {(x, y) ∈ 𝒫(ℕ) × 𝒫(ℕ) | x ⊆ y}, B = {x ∈ 𝒫(ℕ) | x has at most 5
elements}.

*4. Suppose R is a relation on A. You might think that R could not be both
antisymmetric and symmetric, but this isn’t true. Prove that R is both
antisymmetric and symmetric iff R ⊆ i_A.

5. Suppose R is a partial order on A and B ⊆ A. Prove that R ∩ (B × B) is
a partial order on B.

6. Suppose R₁ and R₂ are partial orders on A. For each part, give either a
proof or a counterexample to justify your answer.

(a) Must R₁ ∩ R₂ be a partial order on A?

(b) Must R₁ ∪ R₂ be a partial order on A?

7. Suppose R₁ is a partial order on A₁, R₂ is a partial order on A₂, and A₁
∩ A₂ = ∅.

(a) Prove that R₁ ∪ R₂ is a partial order on A₁ ∪ A₂.

(b) Prove that R₁ ∪ R₂ ∪ (A₁ × A₂) is a partial order on A₁ ∪ A₂.

(c) Suppose that R₁ and R₂ are total orders. Are the partial orders in parts
(a) and (b) also total orders?

*8. Suppose R is a partial order on A and S is a partial order on B.
Define a relation T on A × B as follows: T = {((a, b), (a′, b′)) ∈ (A ×
B) × (A × B) | aRa′ and bSb′}. Show that T is a partial order on A × B.
If both R and S are total orders, will T also be a total order?

9. Suppose R is a partial order on A and S is a partial order on B. Define
a relation L on A × B as follows: L = {((a, b), (a′, b′)) ∈ (A × B) × (A
× B) | aRa′, and if a = a′ then bSb′}. Show that L is a partial order on
A × B. If both R and S are total orders, will L also be a total order?

10. Suppose R is a partial order on A. For each x ∈ A, let P_x = {a ∈ A |
aRx}. Prove that ∀x ∈ A∀y ∈ A(xRy ↔ P_x ⊆ P_y).

*11. Let D be the divisibility relation defined in part 3 of Example 4.4.3.
Let B = {x ∈ ℤ | x > 1}. Does B have any minimal elements? If so,
what are they? Does B have a smallest element? If so, what is it?

12. Show that, as was stated in part 2 of Example 4.4.7, {X ⊆ ℝ | X ≠ ∅
and ∀x∀y((x ∈ X ∧ x < y) → y ∈ X)} has no minimal element.

13. Suppose R is a partial order on A. Prove that R⁻¹ is also a partial order
on A. If R is a total order, will R⁻¹ also be a total order?

*14. Suppose R is a partial order on A, B ⊆ A, and b ∈ B. Exercise 13
shows that R⁻¹ is also a partial order on A.

(a) Prove that b is the R-largest element of B iff it is the R⁻¹-smallest
element of B.

(b) Prove that b is an R-maximal element of B iff it is an R⁻¹-minimal
element of B.

15. Suppose R₁ and R₂ are partial orders on A, R₁ ⊆ R₂, B ⊆ A, and b ∈
B.

(a) Prove that if b is the R₁-smallest element of B, then it is also the
R₂-smallest element of B.

(b) Prove that if b is an R₂-minimal element of B, then it is also an
R₁-minimal element of B.

16. Suppose R is a partial order on A, B ⊆ A, and b ∈ B. Prove that if b is
the largest element of B, then b is also a maximal element of B, and
it’s the only maximal element.

*17. If a subset of a partially ordered set has exactly one minimal 
element, must that element be a smallest element? Give either a proof 
or a counterexample to justify your answer. 


18. Suppose R is a partial order on A, B₁ ⊆ A, B₂ ⊆ A, ∀x ∈ B₁∃y ∈
B₂(xRy), and ∀x ∈ B₂∃y ∈ B₁(xRy).

(a) Prove that for all x ∈ A, x is an upper bound of B₁ iff x is an upper
bound of B₂.

(b) Prove that if B₁ and B₂ are disjoint then neither of them has a maximal
element.


19. Consider the following putative theorem. 


Theorem? Suppose R is a total order on A and B ⊆ A. Then every
element of B is either the smallest element of B or the largest element of
B.

(a) What’s wrong with the following proof of the theorem?

Proof. Suppose b ∈ B. Let x be an arbitrary element of B. Since R is a
total order, either bRx or xRb.

Case 1. bRx. Since x was arbitrary, we can conclude that ∀x ∈ B(bRx),
so b is the smallest element of R.

Case 2. xRb. Since x was arbitrary, we can conclude that ∀x ∈ B(xRb),
so b is the largest element of R.

Thus, b is either the smallest element of B or the largest element of B. 
Since b was arbitrary, every element of B is either its smallest element or 
its largest element. 

(b) Is the theorem correct? Justify your answer with either a proof or a 
counterexample. 


20. Suppose R is a partial order on A, B ⊆ A, and b ∈ B.


(a) Prove that if b is the smallest element of B, then it is also the greatest 
lower bound of B. 


(b) Prove that if b is the largest element of B, then it is also the least upper 
bound of B. 


*21. Suppose R is a partial order on A and B ⊆ A. Let U be the set of all
upper bounds for B.

(a) Prove that U is closed upward; that is, prove that if x ∈ U and xRy,
then y ∈ U.

(b) Prove that every element of B is a lower bound for U.

(c) Prove that if x is the greatest lower bound of U, then x is the least
upper bound of B.

22. Suppose that R is a partial order on A, B₁ ⊆ A, B₂ ⊆ A, x₁ is the least
upper bound of B₁, and x₂ is the least upper bound of B₂. Prove that if
B₁ ⊆ B₂ then x₁Rx₂.

23. Prove Theorem 4.4.11. 

*24. Suppose R is a relation on A. Let S = R ∪ R⁻¹.

(a) Show that S is a symmetric relation on A and R ⊆ S.

(b) Show that if T is a symmetric relation on A and R ⊆ T then S ⊆ T.

Note that this exercise shows that S is the smallest element of the set ℱ =
{T ⊆ A × A | R ⊆ T and T is symmetric}; in other words, it is the smallest
symmetric relation on A that contains R as a subset. The relation S is
called the symmetric closure of R.

25. Suppose that R is a relation on A. Let ℱ = {T ⊆ A × A | R ⊆ T and T
is transitive}.

(a) Show that ℱ ≠ ∅.

(b) Show that ⋂ℱ is a transitive relation on A and R ⊆ ⋂ℱ.

(c) Show that ⋂ℱ is the smallest transitive relation on A that contains R
as a subset. The relation ⋂ℱ is called the transitive closure of R.


26. Suppose R₁ and R₂ are relations on A and R₁ ⊆ R₂.

(a) Let S₁ and S₂ be the symmetric closures of R₁ and R₂, respectively.
Prove that S₁ ⊆ S₂. (See exercise 24 for the definition of symmetric
closure.)

(b) Let T₁ and T₂ be the transitive closures of R₁ and R₂, respectively.
Prove that T₁ ⊆ T₂. (See exercise 25 for the definition of transitive
closure.)

*27. Suppose R₁ and R₂ are relations on A, and let R = R₁ ∪ R₂.

(a) Let S₁, S₂, and S be the symmetric closures of R₁, R₂, and R,
respectively. Prove that S₁ ∪ S₂ = S. (See exercise 24 for the
definition of symmetric closure.)

(b) Let T₁, T₂, and T be the transitive closures of R₁, R₂, and R,
respectively. Prove that T₁ ∪ T₂ ⊆ T, and give an example to show
that it may happen that T₁ ∪ T₂ ≠ T. (See exercise 25 for the
definition of transitive closure.)

28. Suppose A is a set.

(a) Prove that if A has at least two elements then there is no largest
antisymmetric relation on A. In other words, there is no relation R on
A such that R is antisymmetric, and for every antisymmetric relation
S on A, S ⊆ R.

(b) Suppose R is a total order on A. Prove that R is a maximal
antisymmetric relation on A. In other words, there is no antisymmetric
relation S on A such that R ⊆ S and R ≠ S.


29. Suppose R is a relation on A. We say that R is irreflexive if ∀x ∈
A((x, x) ∉ R). R is called a strict partial order on A if it is irreflexive
and transitive. It is called a strict total order if it is a strict partial
order and in addition ∀x ∈ A∀y ∈ A(xRy ∨ yRx ∨ x = y). (Note that
the terminology here is somewhat misleading, because a strict partial
order isn’t a special kind of partial order. It’s not a partial order at all,
since it’s not reflexive!)

(a) Let L = {(x, y) ∈ ℝ × ℝ | x < y}. Show that L is a strict total order on
ℝ.

(b) Show that if R is a partial order on A, then R \ i_A is a strict partial order
on A, and if R is a total order on A, then R \ i_A is a strict total order on
A.

(c) Show that if R is a strict partial order on A, then R ∪ i_A is a partial
order on A, and if R is a strict total order on A, then R ∪ i_A is a total
order on A.

30. Suppose R is a relation on A, and let T be the transitive closure of R.
Prove that if R is symmetric, then so is T. (Hint: Assume that R is
symmetric. Prove that R ⊆ T⁻¹ and T⁻¹ is transitive. What can you
conclude about T and T⁻¹? See exercise 25 for the definition of
transitive closure.)


4.5 Equivalence Relations 


We saw in Example 4.3.3 that the identity relation i_A on any set A is
always reflexive, symmetric, and transitive. Relations with this
combination of properties come up often in mathematics, and they have 
some important properties that we will investigate in this section. These 
relations are called equivalence relations. 


Definition 4.5.1. Suppose R is a relation on a set A. Then R is called an 
equivalence relation on A (or just an equivalence relation if A is clear from
context) if it is reflexive, symmetric, and transitive. 


As we observed earlier, the identity relation i_A on a set A is an
equivalence relation. For another example, let T be the set of all triangles,
and let C be the relation of congruence of triangles. In other words, C =
{(s, t) ∈ T × T | the triangle s is congruent to the triangle t}. (Recall that a
triangle is congruent to another if it can be moved without distorting it so 
that it coincides with the other.) Clearly every triangle is congruent to 
itself, so C is reflexive. Also, if triangle s is congruent to triangle t, then t 
is congruent to s, so C is symmetric; and if r is congruent to s and s is 
congruent to t, then r is congruent to t, so C is transitive. Thus, C is an 
equivalence relation on T. 


As another example, let P be the set of all people, and let B = {(p, q) ∈
P × P | the person p has the same birthday as the person q}. (By “same
birthday” we mean same month and day, but not necessarily the same 
year.) Everyone has the same birthday as himself or herself, so B is 
reflexive. If p has the same birthday as q, then q has the same birthday as 
p, so B is symmetric. And if p has the same birthday as q and q has the 
same birthday as r, then p has the same birthday as r, so B is transitive. 
Therefore B is an equivalence relation. 

It may be instructive to look at the relation B more closely. We can 
think of this relation as splitting the set P of all people into 366 
categories, one for each possible birthday. (Remember, some people were 
born on February 29!) An ordered pair of people will be an element of B
if the people come from the same category, but will not be an element of 
B if the people come from different categories. We could think of these 
categories as forming a family of subsets of P, which we could write as an 
indexed family as follows. First of all, let D be the set of all possible 
birthdays. In other words, D = {Jan. 1, Jan. 2, Jan. 3, ..., Dec. 30, Dec. 
31}. Now for each d ∈ D, let P_d = {p ∈ P | the person p was born on the
day d}. Then the family ℱ = {P_d | d ∈ D} is an indexed family of subsets
of P. The elements of ℱ are called equivalence classes for the relation B,
and every person is an element of exactly one of these equivalence
classes. The relation B consists of those pairs (p, q) ∈ P × P such that the
people p and q are in the same equivalence class. In other words,

B = {(p, q) ∈ P × P | ∃d ∈ D(p ∈ P_d and q ∈ P_d)}
  = {(p, q) ∈ P × P | ∃d ∈ D((p, q) ∈ P_d × P_d)}
  = ⋃_{d ∈ D} (P_d × P_d).


We will call the family ℱ a partition of P because it breaks the set P
into disjoint pieces. It turns out that every equivalence relation on a set A
determines a partition of A, whose elements are the equivalence classes 
for the equivalence relation. But before we can work out the details of 
why this is true, we must define the terms partition and equivalence class 
more carefully. 
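The link between the same-birthday relation and its categories can be seen on a tiny made-up population. In the sketch below, the five people and their birthdays are hypothetical data invented for the illustration; the code builds B directly from its definition, groups the people into the nonempty sets P_d, and compares B with the union of the products P_d × P_d. Birthdays on which nobody in the sample was born give an empty P_d, which contributes nothing to the union, so they are simply omitted.

```python
from collections import defaultdict

# Hypothetical sample of people and their birthdays (month and day only).
birthday = {
    "Ann": "Feb. 29",
    "Bob": "Aug. 10",
    "John": "Aug. 10",
    "Dee": "Jan. 1",
    "Eve": "Feb. 29",
}
P = set(birthday)

# B = {(p, q) in P x P | p has the same birthday as q}
B = {(p, q) for p in P for q in P if birthday[p] == birthday[q]}

# P_d = {p in P | p was born on the day d}, for each d that actually occurs.
classes = defaultdict(set)
for p in P:
    classes[birthday[p]].add(p)

# The union over d of P_d x P_d.
union_of_products = {(p, q)
                     for P_d in classes.values()
                     for p in P_d for q in P_d}

print(B == union_of_products)
print(sorted(sorted(P_d) for P_d in classes.values()))
```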


Definition 4.5.2. Suppose A is a set and ℱ ⊆ 𝒫(A). We will say that ℱ is
pairwise disjoint if every pair of distinct elements of ℱ are disjoint, or in
other words ∀X ∈ ℱ∀Y ∈ ℱ(X ≠ Y → X ∩ Y = ∅). (This concept was
discussed in exercise 5 of Section 3.6.) ℱ is called a partition of A if it has
the following properties:

1. ⋃ℱ = A.

2. ℱ is pairwise disjoint.

3. ∀X ∈ ℱ(X ≠ ∅).


For example, suppose A = {1, 2, 3, 4} and ℱ = {{2}, {1, 3}, {4}}. Then
⋃ℱ = {2} ∪ {1, 3} ∪ {4} = {1, 2, 3, 4} = A, so ℱ satisfies the first
clause in the definition of partition. Also, no two sets in ℱ have any
elements in common, so ℱ is pairwise disjoint, and clearly all the sets in
ℱ are nonempty. Thus, ℱ is a partition of A. On the other hand, the family
𝒢 = {{1, 2}, {1, 3}, {4}} is not pairwise disjoint, because {1, 2} ∩ {1, 3}
= {1} ≠ ∅, so it is not a partition of A. The family ℋ = {∅, {2}, {1, 3},
{4}} is also not a partition of A, because it fails on the third requirement
in the definition.
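For finite sets, the three clauses of Definition 4.5.2 can be checked mechanically. The following Python sketch is only an illustration: the function name is_partition is our own choice, and the code assumes that A and every set in the family are finite.

```python
from itertools import combinations

def is_partition(family, A):
    """Check the three clauses of Definition 4.5.2 for a finite family of sets."""
    family = {frozenset(X) for X in family}      # a family is a set of sets
    covers = set().union(*family) == set(A)      # clause 1: the union is A
    disjoint = all(X.isdisjoint(Y)               # clause 2: pairwise disjoint
                   for X, Y in combinations(family, 2))
    nonempty = all(len(X) > 0 for X in family)   # clause 3: no empty set
    return covers and disjoint and nonempty

A = {1, 2, 3, 4}
F = [{2}, {1, 3}, {4}]
G = [{1, 2}, {1, 3}, {4}]
H = [set(), {2}, {1, 3}, {4}]
print(is_partition(F, A), is_partition(G, A), is_partition(H, A))
```

The three calls correspond to the families ℱ, 𝒢, and ℋ in the example above.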


Definition 4.5.3. Suppose R is an equivalence relation on a set A, and x ∈
A. Then the equivalence class of x with respect to R is the set

[x]_R = {y ∈ A | yRx}.

If R is clear from context, then we just write [x] instead of [x]_R. The set of
all equivalence classes of elements of A is called A modulo R, and is
denoted A/R. Thus,

A/R = {[x]_R | x ∈ A} = {X ⊆ A | ∃x ∈ A(X = [x]_R)}.
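When A is finite and R is given as a set of ordered pairs, Definition 4.5.3 translates almost word for word into code. In the sketch below the people and their birthdays are hypothetical data invented for illustration, and the function names are our own.

```python
def equivalence_class(x, A, R):
    """[x]_R = {y in A | yRx}, following Definition 4.5.3."""
    return frozenset(y for y in A if (y, x) in R)

def quotient(A, R):
    """A/R = {[x]_R | x in A}."""
    return {equivalence_class(x, A, R) for x in A}

# Hypothetical people and birthdays (not taken from the text).
birthday = {"Ann": "Aug. 10", "Bob": "Jan. 2", "Cal": "Aug. 10",
            "Dee": "Jan. 2", "Eve": "Dec. 31"}
P = set(birthday)
B = {(p, q) for p in P for q in P if birthday[p] == birthday[q]}

print(sorted(equivalence_class("Ann", P, B)))
print(len(quotient(P, B)))
```

Because the quotient is built as a set, people with the same equivalence class contribute that class only once.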


In the case of the same-birthday relation B, if p is any person, then 
according to Definition 4.5.3, 


[p]_B = {q ∈ P | qBp}
      = {q ∈ P | the person q has the same birthday as the person p}.

For example, if John was born on August 10, then

[John]_B = {q ∈ P | the person q has the same birthday as John}
         = {q ∈ P | the person q was born on August 10}.

In the notation we introduced earlier, this is just the set P_d, for d = August
10. In fact, it should be clear now that for any person p, if we let d be p’s
birthday, then [p]_B = P_d. This is in agreement with our earlier statement
that the sets P_d are the equivalence classes for the equivalence relation B.
According to Definition 4.5.3, the set of all of these equivalence classes is
called P modulo B:

P/B = {[p]_B | p ∈ P} = {P_d | d ∈ D}.


You are asked to give a more careful proof of this equation in exercise 6. 
As we observed before, this family is a partition of P. 
Let’s consider one more example. Let S be the relation on ℝ defined as
follows:

S = {(x, y) ∈ ℝ × ℝ | x − y ∈ ℤ}.

For example, (5.73, 2.73) ∈ S and (−1.27, 2.73) ∈ S, since 5.73 − 2.73 =
3 ∈ ℤ and −1.27 − 2.73 = −4 ∈ ℤ, but (1.27, 2.73) ∉ S, since 1.27 − 2.73
= −1.46 ∉ ℤ. Clearly for any x ∈ ℝ, x − x = 0 ∈ ℤ, so (x, x) ∈ S, and
therefore S is reflexive. To see that S is symmetric, suppose (x, y) ∈ S. By
the definition of S, this means that x − y ∈ ℤ. But then y − x = −(x − y) ∈
ℤ too, since the negative of any integer is also an integer, so (y, x) ∈ S.
Because (x, y) was an arbitrary element of S, this shows that S is
symmetric. Finally, to see that S is transitive, suppose that (x, y) ∈ S and
(y, z) ∈ S. Then x − y ∈ ℤ and y − z ∈ ℤ. Because the sum of any two
integers is an integer, it follows that x − z = (x − y) + (y − z) ∈ ℤ, so (x, z)
∈ S, as required. Thus, S is an equivalence relation on ℝ.


What do the equivalence classes for this equivalence relation look like? 
We have already observed that (5.73, 2.73) ∈ S and (−1.27, 2.73) ∈ S, so
5.73 ∈ [2.73] and −1.27 ∈ [2.73]. In fact, it is not hard to see what the
other elements of this equivalence class will be: 


[2.73] = {..., −1.27, −0.27, 0.73, 1.73, 2.73, 3.73, 4.73, 5.73, ...}.


In other words, the equivalence class contains all positive real numbers of
the form “_.73” and all negative real numbers of the form “−_.27.” In
general, for any real number x, the equivalence class of x will contain all 
real numbers that differ from x by an integer amount: 


[x] = {..., x − 3, x − 2, x − 1, x, x + 1, x + 2, x + 3, ...}.


Here are a few facts about these equivalence classes that you might try 
to prove to yourself. As you can see in the last equation, x is always an 
element of [x]. If we choose any number x ∈ [2.73], then [x] will be
exactly the same as [2.73]. For example, taking x = 4.73 we find that 


[4.73] = {..., -1.27, -0.27, 0.73, 1.73, 2.73, 3.73, 4.73, 5.73, ...} = [2.73]. 


Thus, [4.73] and [2.73] are just two different names for the same set. But 
if we choose x ∉ [2.73], then [x] will be different from [2.73]. For
example, 


[1.3] = {..., -1.7, -0.7, 0.3, 1.3, 2.3, 3.3, 4.3, ...}. 


In fact, you can see from these equations that [1.3] and [2.73] have no 
elements in common. In other words, [1.3] is actually disjoint from [2.73]. 
In general, for any two real numbers x and y, the equivalence classes [x] 
and [y] are either identical or disjoint. Each equivalence class has many 
different names, but different equivalence classes are disjoint. Because [x] 
always contains x as an element, every equivalence class is nonempty, and 
every real number x is in exactly one equivalence class, namely [x]. In 
other words, the set of all of the equivalence classes, ℝ/S, is a partition of
ℝ. This is another illustration of the fact that the equivalence classes
determined by an equivalence relation always form a partition. 
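The claims about the relation S can be spot-checked with exact arithmetic. The sketch below is an illustration only: it works with rational numbers via Python's Fraction type rather than with all real numbers, and the helper names related and representative are our own. It relies on the observation that x − y is an integer exactly when x and y have the same fractional part x − ⌊x⌋, so the fractional part serves as a label for the equivalence class [x].

```python
from fractions import Fraction
from math import floor

def related(x, y):
    """(x, y) is in S iff x - y is an integer."""
    return (x - y).denominator == 1

def representative(x):
    """Fractional part x - floor(x), a number in [0, 1) shared by all of [x]."""
    return x - floor(x)

a, b, c, d = (Fraction(s) for s in ("5.73", "2.73", "-1.27", "1.3"))

print(related(a, b), related(c, b), related(Fraction("1.27"), b))
print(representative(a) == representative(b) == representative(c))
print(representative(d) == representative(b))
```

The last line compares the labels of [1.3] and [2.73], which the text describes as disjoint classes.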


Theorem 4.5.4. Suppose R is an equivalence relation on a set A. Then 
A/R is a partition of A. 


The proof of Theorem 4.5.4 will be easier to understand if we first 
prove a few facts about equivalence classes. Facts that are proven 
primarily for the purpose of using them to prove a theorem are usually 
called lemmas. 


Lemma 4.5.5. Suppose R is an equivalence relation on A. Then: 


1. For every x ∈ A, x ∈ [x].
2. For every x ∈ A and y ∈ A, y ∈ [x] iff [y] = [x].
Proof. 


1. Let x ∈ A be arbitrary. Since R is reflexive, xRx. Therefore, by the
definition of equivalence class, x ∈ [x].

2. (→) Suppose y ∈ [x]. Then by the definition of equivalence class,
yRx. Now suppose z ∈ [y]. Then zRy. Since zRy and yRx, by
transitivity of R we can conclude that zRx, so z ∈ [x]. Since z was
arbitrary, this shows that [y] ⊆ [x].

Now suppose z ∈ [x], so zRx. We already know yRx, and since R is
symmetric we can conclude that xRy. Applying transitivity to zRx and
xRy, we can conclude that zRy, so z ∈ [y]. Therefore [x] ⊆ [y], so [x] =
[y].

(←) Suppose [y] = [x]. By part 1 we know that y ∈ [y], so since [y] =
[x], it follows that y ∈ [x].

□


Commentary. 
1. According to the definition of equivalence classes, x ∈ [x] means
xRx. This is what leads us to apply the fact that R is reflexive.

2. Of course, the iff form of the goal leads us to prove both directions
separately. For the → direction, the goal is [y] = [x], and, since [y]
and [x] are sets, we can prove this by proving [y] ⊆ [x] and [x] ⊆
[y]. We prove each of these statements by the usual method of
taking an arbitrary element of one set and proving that it is in the 
other. Throughout the proof we use the definition of equivalence 
classes repeatedly, as we did in the proof of statement 1. 


Proof of Theorem 4.5.4. To prove that A/R is a partition of A, we must
prove the three properties in Definition 4.5.2. For the first, we must show
that ⋃(A/R) = A, or in other words that ⋃_{x ∈ A} [x] = A. Now every
equivalence class in A/R is a subset of A, so it should be clear that their
union is also a subset of A. Thus, ⋃(A/R) ⊆ A, so all we need to show to
finish the proof is that A ⊆ ⋃(A/R). To prove this, suppose x ∈ A. Then by
Lemma 4.5.5, x ∈ [x], and of course [x] ∈ A/R, so x ∈ ⋃(A/R). Thus,
⋃(A/R) = A.

To see that A/R is pairwise disjoint, suppose that X and Y are two
elements of A/R, and X ∩ Y ≠ ∅. By definition of A/R, X and Y are
equivalence classes, so we must have X = [x] and Y = [y] for some x, y ∈
A. Since X ∩ Y ≠ ∅, we can choose some z such that z ∈ X ∩ Y = [x] ∩
[y]. Now by Lemma 4.5.5, since z ∈ [x] and z ∈ [y], it follows that [x] =
[z] = [y]. Thus, X = Y. This shows that if X ≠ Y then X ∩ Y = ∅, so A/R is
pairwise disjoint.

Finally, for the last clause of the definition of partition, suppose X ∈
A/R. As before, this means that X = [x] for some x ∈ A. Now by Lemma
4.5.5, x ∈ [x] = X, so X ≠ ∅, as required.

□
Commentary. We have given an intuitive reason why ⋃(A/R) ⊆ A, but if
you’re not sure why this is correct, you should write out a formal proof.
(You might also want to look at exercise 16 in Section 3.3.) The proof that
A ⊆ ⋃(A/R) is straightforward.

The definition of pairwise disjoint suggests that to prove that A/R is
pairwise disjoint we should let X and Y be arbitrary elements of A/R and
then prove X ≠ Y → X ∩ Y = ∅. Recall that the statement that a set is
empty is really a negative statement, so both the antecedent and the
consequent of this conditional are negative. This suggests that it will
probably be easier to prove the contrapositive, so we assume X ∩ Y ≠ ∅
and prove X = Y. The givens X ∈ A/R, Y ∈ A/R, and X ∩ Y ≠ ∅ are all
existential statements, so we use them to introduce the variables x, y, and
z. Lemma 4.5.5 now takes care of the proof that X = Y as well as the proof
of the final clause in the definition of partition.
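Lemma 4.5.5 and Theorem 4.5.4 can be tested on a small case. The relation below (integers related when they have the same absolute value) is a hypothetical example chosen for illustration, and a check on one finite example is of course not a substitute for the proofs above.

```python
# Hypothetical finite example: x R y iff |x| = |y|.
A = {-2, -1, 0, 1, 2}
R = {(x, y) for x in A for y in A if abs(x) == abs(y)}

def cls(x):
    """[x] = {y in A | yRx}, as in Definition 4.5.3."""
    return frozenset(y for y in A if (y, x) in R)

quotient = {cls(x) for x in A}          # A/R

# Lemma 4.5.5, parts 1 and 2.
lemma_1 = all(x in cls(x) for x in A)
lemma_2 = all((y in cls(x)) == (cls(y) == cls(x)) for x in A for y in A)

# Theorem 4.5.4: the three clauses of Definition 4.5.2 for A/R.
covers = set().union(*quotient) == A
disjoint = all(X == Y or X.isdisjoint(Y) for X in quotient for Y in quotient)
nonempty = all(len(X) > 0 for X in quotient)

print(lemma_1, lemma_2, covers, disjoint, nonempty)
```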


Theorem 4.5.4 shows that if R is an equivalence relation on A then A/R 
is a partition of A. In fact, it turns out that every partition of A arises in 
this way. 


Theorem 4.5.6. Suppose A is a set and ℱ is a partition of A. Then there is
an equivalence relation R on A such that A/R = ℱ.


Before proving this theorem, it might be worthwhile to discuss the
strategy for the proof briefly. Because the conclusion of the theorem is an
existential statement, we should try to find an equivalence relation R such
that A/R = ℱ. Clearly for different choices of ℱ we will need to choose R
differently, so the definition of R should depend on ℱ in some way.
Looking back at the same-birthday example at the start of this section
may help you see how to proceed. Recall that in that example the
equivalence relation B consisted of all pairs of people (p, q) such that p
and q were in the same set in the partition {P_d | d ∈ D}. In fact, we found
that we could also express this by saying that B = ⋃_{d ∈ D} (P_d × P_d). This
suggests that in the proof of Theorem 4.5.6 we should let R be the set of
all pairs (x, y) ∈ A × A such that x and y are in the same set in the
partition ℱ. An alternative way to write this would be R = ⋃_{X ∈ ℱ} (X × X).


For example, consider again the example of a partition given after
Definition 4.5.2. In that example we had A = {1, 2, 3, 4} and ℱ = {{2},
{1, 3}, {4}}. Now let’s define a relation R on A as suggested in the last
paragraph. This gives us:

R = ⋃_{X ∈ ℱ} (X × X)
  = ({2} × {2}) ∪ ({1, 3} × {1, 3}) ∪ ({4} × {4})
  = {(2, 2)} ∪ {(1, 1), (1, 3), (3, 1), (3, 3)} ∪ {(4, 4)}
  = {(2, 2), (1, 1), (1, 3), (3, 1), (3, 3), (4, 4)}.

The directed graph for this relation is shown in Figure 4.7. We will let you
check that R is an equivalence relation and that the equivalence classes
are

[2]_R = {2},   [1]_R = [3]_R = {1, 3},   [4]_R = {4}.

Thus, the set of all equivalence classes is A/R = {{2}, {1, 3}, {4}}, which
is precisely the same as the partition ℱ we started with.
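The computation in this example is easy to reproduce. The sketch below builds the union of the sets X × X for the partition from the example and then recomputes the equivalence classes; the function name relation_from_partition is our own.

```python
from itertools import product

def relation_from_partition(F):
    """Union of X x X over all X in the family F."""
    return {(x, y) for X in F for x, y in product(X, repeat=2)}

A = {1, 2, 3, 4}
F = {frozenset({2}), frozenset({1, 3}), frozenset({4})}

R = relation_from_partition(F)
classes = {frozenset(y for y in A if (y, x) in R) for x in A}   # A/R

print(sorted(R))
print(classes == F)
```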


Of course, the reasoning that led us to the formula R = ⋃_{X ∈ ℱ} (X × X) will not
be part of the proof of Theorem 4.5.6. When we write the proof, we can
simply define R in this way and then verify that it is an equivalence
relation on A and that A/R = ℱ. It may make the proof easier to follow if
we once again prove some lemmas first.


Figure 4.7. 


Lemma 4.5.7. Suppose A is a set and ℱ is a partition of A. Let R =
⋃_{X ∈ ℱ} (X × X). Then R is an equivalence relation on A. We will call R the
equivalence relation determined by ℱ.

Proof. We’ll prove that R is reflexive and leave the rest for you to do in
exercise 8. Let x be an arbitrary element of A. Since ℱ is a partition of A,
⋃ℱ = A, so x ∈ ⋃ℱ. Thus, we can choose some X ∈ ℱ such that x ∈ X.
But then (x, x) ∈ X × X, so (x, x) ∈ ⋃_{X ∈ ℱ} (X × X) = R. Therefore, R is
reflexive.

□


Commentary. After letting x be an arbitrary element of A, we must prove
(x, x) ∈ R. Because R = ⋃_{X ∈ ℱ} (X × X), this means we must prove ∃X ∈
ℱ((x, x) ∈ X × X), or in other words ∃X ∈ ℱ(x ∈ X). But this just means
x ∈ ⋃ℱ, so this suggests using the first clause in the definition of
partition, which says that ⋃ℱ = A.


Lemma 4.5.8. Suppose A is a set and ℱ is a partition of A. Let R be the
equivalence relation determined by ℱ. Suppose X ∈ ℱ and x ∈ X. Then
[x]_R = X.

Proof. Suppose y ∈ [x]_R. Then (y, x) ∈ R, so by the definition of R there
must be some Y ∈ ℱ such that (y, x) ∈ Y × Y, and therefore y ∈ Y and x
∈ Y. Since x ∈ X and x ∈ Y, X ∩ Y ≠ ∅, and since ℱ is pairwise disjoint
it follows that X = Y. Thus, since y ∈ Y, y ∈ X. Since y was an arbitrary
element of [x]_R, we can conclude that [x]_R ⊆ X.
Now suppose y ∈ X. Then (y, x) ∈ X × X, so (y, x) ∈ R and therefore y
∈ [x]_R. Thus X ⊆ [x]_R, so [x]_R = X.
□
Commentary. To prove [x]_R = X we prove [x]_R ⊆ X and X ⊆ [x]_R. For the
first we start with an arbitrary y ∈ [x]_R and prove y ∈ X. Writing out the
definition of [x]_R we get (y, x) ∈ R, and since R was defined to be ⋃_{Y ∈ ℱ}
(Y × Y), this means ∃Y ∈ ℱ((y, x) ∈ Y × Y). Of course, since this is an
existential statement we immediately introduce the new variable Y by
existential instantiation. Since this gives us y ∈ Y and our goal is y ∈ X, it
is not surprising that the proof is completed by proving Y = X.

The proof that X ⊆ [x]_R also uses the definitions of [x]_R and R, but is
more straightforward.


Proof of Theorem 4.5.6. Let R = ⋃_{X ∈ ℱ} (X × X). We have already seen that
R is an equivalence relation, so we need only check that A/R = ℱ. To see
this, suppose X ∈ A/R. This means that X = [x] for some x ∈ A. Since ℱ
is a partition, we know that ⋃ℱ = A, so x ∈ ⋃ℱ, and therefore we can
choose some Y ∈ ℱ such that x ∈ Y. But then by Lemma 4.5.8, [x] = Y.
Thus X = Y ∈ ℱ, so A/R ⊆ ℱ.

Now suppose X ∈ ℱ. Then since ℱ is a partition, X ≠ ∅, so we can
choose some x ∈ X. Therefore by Lemma 4.5.8, X = [x] ∈ A/R, so ℱ ⊆
A/R. Thus, A/R = ℱ.

□
Commentary. We prove that A/R = ℱ by proving that A/R ⊆ ℱ and ℱ ⊆
A/R. For the first, we take an arbitrary X ∈ A/R and prove that X ∈ ℱ.
Because X ∈ A/R means ∃x ∈ A(X = [x]), we immediately introduce the
new variable x to stand for an element of A such that X = [x]. The proof
that X ∈ ℱ now proceeds by the slightly roundabout route of finding a set
Y ∈ ℱ such that X = Y. This is motivated by Lemma 4.5.8, which
suggests a way of showing that an element of ℱ is equal to [x] = X. The
proof that ℱ ⊆ A/R also relies on Lemma 4.5.8.


We have seen how an equivalence relation R on a set A can be used to
define a partition A/R of A and also how a partition ℱ of A can be used to
define an equivalence relation ⋃_{X ∈ ℱ} (X × X) on A. The proof of Theorem
4.5.6 demonstrates an interesting relationship between these operations. If
you start with a partition ℱ of A, use ℱ to define the equivalence relation
R = ⋃_{X ∈ ℱ} (X × X), and then use R to define a partition A/R, then you end
up back where you started. In other words, the final partition A/R is the
same as the original partition ℱ. You might wonder if the same idea would
work in the other order. In other words, suppose you start with an
equivalence relation R on A, use R to define a partition ℱ = A/R, and then
use ℱ to define an equivalence relation S = ⋃_{X ∈ ℱ} (X × X). Would the final
equivalence relation S be the same as the original equivalence relation R? 
You are asked in exercise 10 to show that the answer is yes. 
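Before attempting the proof, it can be reassuring to try the round trip on a small case. The relation below (same absolute value on a five-element set) is a hypothetical example, and a single experiment like this does not prove the general statement.

```python
# Hypothetical finite example: x R y iff |x| = |y|.
A = {-2, -1, 0, 1, 2}
R = {(x, y) for x in A for y in A if abs(x) == abs(y)}

# Step 1: the partition F = A/R.
F = {frozenset(y for y in A if (y, x) in R) for x in A}

# Step 2: the equivalence relation determined by F, the union of X x X.
S = {(x, y) for X in F for x in X for y in X}

print(S == R)
```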

We end this section by considering a few more examples of 


equivalence relations. A very useful family of equivalence relations is 
given by the next definition. 


Definition 4.5.9. Suppose m is a positive integer. For any integers x and y,
we will say that x is congruent to y modulo m if ∃k ∈ ℤ(x − y = km). In
other words, x is congruent to y modulo m iff m | (x − y). We will use the
notation x ≡ y (mod m) to mean that x is congruent to y modulo m.


For example, 12 ≡ 27 (mod 5), since 12 − 27 = −15 = (−3) · 5. Now for
any positive integer m we can consider the relation {(x, y) ∈ ℤ × ℤ | x ≡ y
(mod m)}. As we mentioned in the last section, mathematicians sometimes
use symbols rather than letters as names of relations. In this case,
motivated by the notation in Definition 4.5.9, we will use the symbol ≡_m
as our name for this relation. Thus, for any integers x and y, x ≡_m y means
the same thing as x ≡ y (mod m). It turns out that this relation is another
example of an equivalence relation.


Theorem 4.5.10. For every positive integer m, ≡_m is an equivalence
relation on ℤ.

Proof. We will check transitivity for ≡_m and let you check reflexivity and
symmetry in exercise 11. To see that ≡_m is transitive, suppose that x ≡_m y
and y ≡_m z. This means that x ≡ y (mod m) and y ≡ z (mod m), or in other
words m | (x − y) and m | (y − z). Therefore, by exercise 18(a) in Section
3.3, m | [(x − y) + (y − z)]. But (x − y) + (y − z) = x − z, so it follows that
m | (x − z), and therefore x ≡_m z.
□
We will have more to say about these equivalence relations later in this 
book, especially in Chapter 7. 
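Definition 4.5.9 is straightforward to express in code, since m | (x − y) means that x − y leaves remainder 0 on division by m. The sketch below checks the example 12 ≡ 27 (mod 5) and then tests the three defining properties of an equivalence relation on a finite window of integers. The modulus 5 and the window −10, ..., 10 are arbitrary choices made for illustration, and such a test is not a proof of Theorem 4.5.10.

```python
def congruent(x, y, m):
    """x ≡ y (mod m): m divides x - y. Assumes m is a positive integer."""
    return (x - y) % m == 0

print(congruent(12, 27, 5))      # the example from the text

m, window = 5, range(-10, 11)     # hypothetical modulus and finite window
reflexive = all(congruent(x, x, m) for x in window)
symmetric = all(congruent(y, x, m)
                for x in window for y in window if congruent(x, y, m))
transitive = all(congruent(x, z, m)
                 for x in window for y in window for z in window
                 if congruent(x, y, m) and congruent(y, z, m))
print(reflexive, symmetric, transitive)
```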


Equivalence relations often come up when we want to group together 
elements of a set that have something in common. For example, if you’ve 
studied vectors in a previous math course or perhaps in a physics course, 
then you may have been told that vectors can be thought of as arrows. But 
you were probably also told that different arrows that point in the same 
direction and have the same length must be thought of as representing the 
same vector. Here’s a more lucid explanation of the relationship between 
vectors and arrows. Let A be the set of all arrows, and let R = {(x, y) ∈
A × A | the arrows x and y point in the same direction and have the same
length}. We will let you check for yourself that R is an equivalence 
relation on A. Each equivalence class consists of arrows that all have the 
same length and point in the same direction. We can now think of vectors 
as being represented, not by arrows, but by equivalence classes of arrows. 

Students who are familiar with computer programming may be 
interested in our next example. Suppose we let P be the set of all 
computer programs, and for any computer programs p and q we say that p 
and q are equivalent if they always produce the same output when given 
the same input. Let R = {(p, q) ∈ P × P | the programs p and q are
equivalent}. It is not hard to check that R is an equivalence relation on P. 
The equivalence classes group together programs that produce the same 
output when given the same input. 


Exercises 


*1. Find all partitions of the set A = {1, 2, 3}. 
2. Find all equivalence relations on the set A = {1, 2, 3}. 


*3. Let W = the set of all words in the English language. Which of the 
following relations on W are equivalence relations? For those that 
are equivalence relations, what are the equivalence classes? 


(a) R = {(x, y) ∈ W × W | the words x and y start with the same letter}.

(b) S = {(x, y) ∈ W × W | the words x and y have at least one letter in
common}.

(c) T = {(x, y) ∈ W × W | the words x and y have the same number of
letters}.


4. Which of the following relations on ℝ are equivalence relations?
For those that are equivalence relations, what are the equivalence
classes?

(a) R = {(x, y) ∈ ℝ × ℝ | x − y ∈ ℕ}.
(b) S = {(x, y) ∈ ℝ × ℝ | x − y ∈ ℚ}.
(c) T = {(x, y) ∈ ℝ × ℝ | ∃n ∈ ℤ(y = x · 10^n)}.


5. Let L be the set of all nonvertical lines in the plane. Which of the
following relations on L are equivalence relations? For those that
are equivalence relations, what are the equivalence classes?

(a) R = {(k, l) ∈ L × L | the lines k and l have the same slope}.
(b) S = {(k, l) ∈ L × L | the lines k and l are perpendicular}.
(c) T = {(k, l) ∈ L × L | k ∩ x = l ∩ x and k ∩ y = l ∩ y}, where x and y
are the x-axis and the y-axis. (We are treating lines as sets of points
here.)


*6. In the discussion of the same-birthday equivalence relation B
following Definition 4.5.3, we claimed that P/B = {P_d | d ∈ D}.
Give a careful proof of this claim. You will find when you work out
the proof that there is an assumption you must make about people’s
birthdays (a very reasonable assumption) to make the proof work.
What is this assumption?

7. Let T be the set of all triangles, and let S = {(s, t) ∈ T × T | the
triangles s and t are similar}. (Recall that two triangles are similar if
the angles of one triangle are equal to corresponding angles of the
other.) Verify that S is an equivalence relation.

8. Complete the proof of Lemma 4.5.7.

9. Suppose R and S are equivalence relations on A and A/R = A/S.
Prove that R = S.

*10. Suppose R is an equivalence relation on A. Let ℱ = A/R, and let S
be the equivalence relation determined by ℱ. In other words, S =
⋃_{X ∈ ℱ} (X × X). Prove that S = R.

11. Let ≡_m be the “congruence modulo m” relation defined in the text,
for a positive integer m.

(a) Complete the proof of Theorem 4.5.10 by showing that ≡_m is
reflexive and symmetric.

(b) Find all the equivalence classes for ≡_2 and ≡_3. How many equivalence
classes are there in each case? In general how many equivalence
classes do you think there are for ≡_m?

12. Prove that for every integer n, either n² ≡ 0 (mod 4) or n² ≡ 1 (mod
4).

*13. Suppose m is a positive integer. Prove that for all integers a, a′, b,
and b′, if a′ ≡ a (mod m) and b′ ≡ b (mod m) then a′ + b′ ≡ a + b
(mod m) and ab ≡ a′b′ (mod m).

14. Suppose that R is an equivalence relation on A and B ⊆ A. Let S =
R ∩ (B × B).

(a) Prove that S is an equivalence relation on B.

(b) Prove that for all x ∈ B, [x]_S = [x]_R ∩ B.


15. Suppose B ⊆ A, and define a relation R on 𝒫(A) as follows:
R = {(X, Y) ∈ 𝒫(A) × 𝒫(A) | X △ Y ⊆ B}.

(a) Prove that R is an equivalence relation on 𝒫(A).

(b) Prove that for every X ∈ 𝒫(A) there is exactly one Y ∈ [X]_R such that
Y ∩ B = ∅.

*16. Suppose ℱ is a partition of A, 𝒢 is a partition of B, and A and B are
disjoint. Prove that ℱ ∪ 𝒢 is a partition of A ∪ B.


17. Suppose R is an equivalence relation on A, S is an equivalence
relation on B, and A and B are disjoint.

(a) Prove that R ∪ S is an equivalence relation on A ∪ B.
(b) Prove that for all x ∈ A, [x]_{R∪S} = [x]_R, and for all y ∈ B, [y]_{R∪S} = [y]_S.
(c) Prove that (A ∪ B)/(R ∪ S) = (A/R) ∪ (B/S).

18. Suppose ℱ and 𝒢 are partitions of a set A. We define a new family of
sets ℱ · 𝒢 as follows:

ℱ · 𝒢 = {Z ∈ 𝒫(A) | Z ≠ ∅ and ∃X ∈ ℱ ∃Y ∈ 𝒢(Z = X ∩ Y)}.

Prove that ℱ · 𝒢 is a partition of A.


19. Let ℱ = {ℝ⁻, ℝ⁺, {0}} and 𝒢 = {ℤ, ℝ \ ℤ}, and note that both ℱ and 𝒢
are partitions of ℝ. List the elements of ℱ · 𝒢. (See exercise 18 for the
meaning of the notation used here.)

*20. Suppose R and S are equivalence relations on a set A. Let T = R ∩ S.

(a) Prove that T is an equivalence relation on A.

(b) Prove that for all x ∈ A, [x]_T = [x]_R ∩ [x]_S.

(c) Prove that A/T = (A/R) · (A/S). (See exercise 18 for the meaning of the
notation used here.)

21. Suppose ℱ is a partition of A and 𝒢 is a partition of B. We define a new
family of sets ℱ ⊗ 𝒢 as follows:

ℱ ⊗ 𝒢 = {Z ∈ 𝒫(A × B) | ∃X ∈ ℱ ∃Y ∈ 𝒢(Z = X × Y)}.

Prove that ℱ ⊗ 𝒢 is a partition of A × B.


*22. Let ℱ = {ℝ⁻, ℝ⁺, {0}}, which is a partition of ℝ. List the elements of
ℱ ⊗ ℱ, and describe them geometrically as subsets of the xy-plane. (See
exercise 21 for the meaning of the notation used here.)

23. Suppose R is an equivalence relation on A and S is an equivalence
relation on B. Define a relation T on A × B as follows:

T = {((a, b), (a′, b′)) ∈ (A × B) × (A × B) | aRa′ and bSb′}.

(a) Prove that T is an equivalence relation on A × B.

(b) Prove that if a ∈ A and b ∈ B then [(a, b)]_T = [a]_R × [b]_S.

(c) Prove that (A × B)/T = (A/R) ⊗ (B/S). (See exercise 21 for the meaning
of the notation used here.)

*24. Suppose R and S are relations on a set A, and S is an equivalence
relation. We will say that R is compatible with S if for all x, y, x′, and y′
in A, if xSx′ and ySy′ then xRy iff x′Ry′.

(a) Prove that if R is compatible with S, then there is a unique relation T on
A/S such that for all x and y in A, [x]_S T [y]_S iff xRy.


(b) Suppose T is a relation on A/S and for all x and y in A, [x]_S T [y]_S iff xRy.
Prove that R is compatible with S.

25. Suppose R is a relation on A and R is reflexive and transitive. (Such a
relation is called a preorder on A.) Let S = R ∩ R⁻¹.

(a) Prove that S is an equivalence relation on A.

(b) Prove that there is a unique relation T on A/S such that for all x and
y in A, [x]_S T [y]_S iff xRy. (Hint: Use exercise 24.)

(c) Prove that T is a partial order on A/S, where T is the relation from part
(b).

26. Let I = {1, 2, ..., 100}, 𝒜 = 𝒫(I), and R = {(X, Y) ∈ 𝒜 × 𝒜 | Y has at
least as many elements as X}.

(a) Prove that R is a preorder on 𝒜. (See exercise 25 for the definition of
preorder.)

(b) Let S and T be defined as in exercise 25. Describe the elements of 𝒜/S
and the partial order T. How many elements does 𝒜/S have? Is T a total
order?

27. Suppose A is a set. If ℱ and 𝒢 are partitions of A, then we’ll say that ℱ
refines 𝒢 if ∀X ∈ ℱ ∃Y ∈ 𝒢(X ⊆ Y). Let P be the set of all partitions of
A, and let R = {(ℱ, 𝒢) ∈ P × P | ℱ refines 𝒢}.

(a) Prove that R is a partial order on P.

(b) Suppose that S and T are equivalence relations on A. Let ℱ = A/S and 𝒢
= A/T. Prove that S ⊆ T iff ℱ refines 𝒢.

(c) Suppose ℱ and 𝒢 are partitions of A. Prove that ℱ · 𝒢 is the greatest
lower bound of the set {ℱ, 𝒢} in the partial order R. (See exercise 18
for the meaning of the notation used here.)


5 


Functions 


5.1 Functions 


Suppose P is the set of all people, and let H = {(p, n) ∈ P × ℕ | the person p
has n children}. Then H is a relation from P to ℕ, and it has the following
important property. For every p ∈ P, there is exactly one n ∈ ℕ such that
(p, n) ∈ H. Mathematicians express this by saying that H is a function from
P to ℕ.


Definition 5.1.1. Suppose F is a relation from A to B. Then F is called a
function from A to B if for every a ∈ A there is exactly one b ∈ B such that
(a, b) ∈ F. In other words, to say that F is a function from A to B means:

∀a ∈ A ∃!b ∈ B((a, b) ∈ F).

To indicate that F is a function from A to B, we will write F: A → B.


Example 5.1.2. 


1. Let A = {1, 2, 3}, B = {4, 5, 6}, and F = {(1, 5), (2, 4), (3, 5)}. Is F a
function from A to B?

2. Let A = {1, 2, 3}, B = {4, 5, 6}, and G = {(1, 5), (2, 4), (1, 6)}. Is G a
function from A to B?

3. Let C be the set of all cities and N the set of all countries, and let L =
{(c, n) ∈ C × N | the city c is in the country n}. Is L a function from C
to N?

4. Let P be the set of all people, and let C = {(p, q) ∈ P × P | the person
p is a parent of the person q}. Is C a function from P to P?

5. Let P be the set of all people, and let D = {(p, x) ∈ P × 𝒫(P) | x = the
set of all children of p}. Is D a function from P to 𝒫(P)?

6. Let A be any set. Recall that i_A = {(a, a) | a ∈ A} is called the identity
relation on A. Is it a function from A to A?

7. Let f = {(x, y) ∈ ℝ × ℝ | y = x²}. Is f a function from ℝ to ℝ?


Solutions

1. Yes. Note that 1 is paired with 5 in the relation F, but it is not paired
with any other element of B. Similarly, 2 is paired only with 4, and 3
with 5. In other words, each element of A appears as the first
coordinate of exactly one ordered pair in F. Therefore F is a function
from A to B. Note that the definition of function does not require that
each element of B be paired with exactly one element of A. Thus, it
doesn’t matter that 5 occurs as the second coordinate of two different
pairs in F and that 6 doesn’t occur in any ordered pairs at all.

2. No. G fails to be a function from A to B for two reasons. First of all, 3
isn’t paired with any element of B in the relation G, which violates the
requirement that every element of A must be paired with some
element of B. Second, 1 is paired with two different elements of B, 5
and 6, which violates the requirement that each element of A be paired
with only one element of B.

3. If we make the reasonable assumption that every city is in exactly one
country, then L is a function from C to N.

4. Because some people have no children and some people have more
than one child, C is not a function from P to P.

5. Yes, D is a function from P to 𝒫(P). Each person p is paired with
exactly one set x ⊆ P, namely the set of all children of p. Note that in
the relation D, a person p is paired with the set consisting of all of p’s
children, not with the children themselves. Even if p does not have
exactly one child, it is still true that there is exactly one set that
contains precisely the children of p and nothing else.

6. Yes. Each a ∈ A is paired in the relation i_A with exactly one element
of A, namely a itself. In other words, (a, a) ∈ i_A, but for every a′ ≠ a,
(a, a′) ∉ i_A. Thus, we can call i_A the identity function on A.

7. Yes. For each real number x there is exactly one value of y, namely y
= x², such that (x, y) ∈ f.
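For finite sets, Definition 5.1.1 can be checked directly by counting, for each a ∈ A, the ordered pairs whose first coordinate is a. The sketch below applies this to parts 1 and 2 of Example 5.1.2; the function name is_function is our own, and the code assumes the relation is given as a finite set of pairs.

```python
def is_function(F, A, B):
    """Definition 5.1.1: each a in A is the first coordinate of exactly one pair in F."""
    if not all(a in A and b in B for (a, b) in F):
        return False                      # F is not even a relation from A to B
    return all(sum(1 for (a2, _) in F if a2 == a) == 1 for a in A)

A = {1, 2, 3}
B = {4, 5, 6}
F = {(1, 5), (2, 4), (3, 5)}
G = {(1, 5), (2, 4), (1, 6)}

print(is_function(F, A, B), is_function(G, A, B))
```

Note that the check never asks whether each element of B is used, which matches the remark in solution 1.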


Suppose f: A → B. If a ∈ A, then we know that there is exactly one b ∈
B such that (a, b) ∈ f. This unique b is called “the value of f at a,” or “the
image of a under f,” or “the result of applying f to a,” or just “f of a,” and it
is written f(a). In other words, for every a ∈ A and b ∈ B, b = f(a) iff (a, b)
∈ f. For example, for the function F = {(1, 5), (2, 4), (3, 5)} in part 1 of
Example 5.1.2, we could say that F(1) = 5, since (1, 5) ∈ F. Similarly,
F(2) = 4 and F(3) = 5. If L is the function in part 3 and c is any city, then
L(c) would be the unique country n such that (c, n) ∈ L. In other words,
L(c) = the country in which c is located. For example, L(Paris) = France. For
the function D in part 5, we could say that for any person p, D(p) = the set
of all children of p. If A is any set and a ∈ A, then (a, a) ∈ i_A, so i_A(a) = a.
And if f is the function in part 7, then for every real number x, f(x) = x².

A function f from a set A to another set B is often specified by giving a
rule that can be used to determine f(a) for any a ∈ A. For example, if A is
the set of all people and B = ℝ⁺, then we could define a function f from A to
B by the rule that for every a ∈ A, f(a) = a’s height in inches. Although this
definition doesn’t say explicitly which ordered pairs are elements of f, we
can determine this by using our rule that for all a ∈ A and b ∈ B, (a, b) ∈ f
iff b = f(a). Thus,

f = {(a, b) ∈ A × B | b = f(a)}
  = {(a, b) ∈ A × B | b = a’s height in inches}.

For example, if Joe Smith is 68 inches tall, then (Joe Smith, 68) ∈ f and
f(Joe Smith) = 68.

It is often useful to think of a function f from A to B as representing a rule 
that associates, with each a ∈ A, some corresponding object b = f(a) ∈ B.
However, it is important to remember that although a function can be 
defined by giving such a rule, it need not be defined in this way. Any subset 
of A x B that satisfies the requirements given in Definition 5.1.1 is a 
function from A to B. 


Example 5.1.3. Here are some more examples of functions defined by 
rules. 


1. Suppose every student is assigned an academic advisor who is a
professor. Let S be the set of students and P the set of professors.
Then we can define a function f from S to P by the rule that for every
student s, f(s) = the advisor of s. In other words,

f = {(s, p) ∈ S × P | p = f(s)}
  = {(s, p) ∈ S × P | the professor p is the academic advisor of
    the student s}.

2. We can define a function g from ℤ to ℝ by the rule that for every x ∈
ℤ, g(x) = 2x + 3. Then

g = {(x, y) ∈ ℤ × ℝ | y = g(x)}
  = {(x, y) ∈ ℤ × ℝ | y = 2x + 3}
  = {..., (−2, −1), (−1, 1), (0, 3), (1, 5), (2, 7), ...}.

3. Let h be the function from ℝ to ℝ defined by the rule that for every x
∈ ℝ, h(x) = 2x + 3. Note that the formula for h(x) is the same as the
formula for g(x) in part 2. However, h and g are not the same
function. You can see this by noting that, for example, (π, 2π + 3) ∈ h
but (π, 2π + 3) ∉ g, since π ∉ ℤ. (For more on the relationship
between g and h, see exercise 7(c).)
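The difference between g and h in parts 2 and 3 comes entirely from their domains, and this is visible once the functions are written as sets of ordered pairs. Since ℤ and ℝ are infinite, the sketch below uses small finite stand-ins for the two domains; these samples are assumptions made purely for illustration.

```python
from math import pi

# Hypothetical finite stand-ins for the domains Z and R.
Z_sample = {-2, -1, 0, 1, 2}
R_sample = Z_sample | {pi}

g = {(x, 2 * x + 3) for x in Z_sample}   # same rule, domain Z_sample
h = {(x, 2 * x + 3) for x in R_sample}   # same rule, larger domain

print(sorted(g))
print((pi, 2 * pi + 3) in h, (pi, 2 * pi + 3) in g)
print(g == h)
```

The same formula produces two different sets of ordered pairs, so the two functions are different.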


Notice that when a function f from A to B is specified by giving a rule for
finding f(a), the rule must determine the value of f(a) for every a ∈ A.
Sometimes when mathematicians are stating such a rule they don’t say
explicitly that the rule applies to all a ∈ A. For example, a mathematician
might say “let f be the function from ℝ to ℝ defined by the formula f(x) = x²
+ 7.” It is understood in this case that the equation f(x) = x² + 7 applies to all
x ∈ ℝ even though it hasn’t been said explicitly. This means that you can
plug in any real number for x in this equation, and the resulting equation
will be true. For example, you can conclude that f(3) = 3² + 7 = 16.
Similarly, if w is a real number, then you can write f(w) = w² + 7, or even
f(2w − 3) = (2w − 3)² + 7 = 4w² − 12w + 16.


Because a function f from A to B is completely determined by the rule for 
finding f(a), two functions that are defined by equivalent rules must be 
equal. More precisely, we have the following theorem: 


Theorem 5.1.4. Suppose f and g are functions from A to B. If ∀a ∈ A(f(a) = 
g(a)), then f = g. 


Proof. Suppose ∀a ∈ A(f(a) = g(a)), and let (a, b) be an arbitrary element 
of f. Then b = f(a). But by our assumption f(a) = g(a), so b = g(a) and 
therefore (a, b) ∈ g. Thus, f ⊆ g. A similar argument shows g ⊆ f, so f = g. 


Commentary. Because f and g are sets, we prove f = g by proving f ⊆ g and 
g ⊆ f. Each of these goals is proven by showing that an arbitrary element of 
one set must be an element of the other. Note that, now that we have proven 
Theorem 5.1.4, we have another method for proving that two functions f 
and g from a set A to another set B are equal. In the future, to prove f = g we 
will usually prove ∀a ∈ A(f(a) = g(a)) and then apply Theorem 5.1.4. 
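
Theorem 5.1.4 can be illustrated on a finite domain. The sketch below is an assumption-laden illustration rather than a proof: the helper graph, the domain A, and the two rules are made up for the example. It builds two functions from differently written but equivalent rules and checks both that they agree at every point and that they are equal as sets of ordered pairs.

```python
def graph(rule, domain):
    """The set of ordered pairs (a, rule(a)) for a in domain."""
    return {(a, rule(a)) for a in domain}


A = range(-3, 4)
f = graph(lambda a: (a + 1) ** 2, A)
g = graph(lambda a: a * a + 2 * a + 1, A)

# Look up values through the pairs themselves.
f_at, g_at = dict(f), dict(g)

pointwise_equal = all(f_at[a] == g_at[a] for a in A)
equal_as_sets = (f == g)

print(pointwise_equal, equal_as_sets)  # True True
```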

Because functions are just relations of a special kind, the concepts 
introduced in Chapter 4 for relations can be applied to functions as well. For 
example, suppose f: A → B. Then f is a relation from A to B, so it makes 
sense to talk about the domain of f, which is a subset of A, and the range of 
f, which is a subset of B. According to the definition of function, every 
element of A must appear as the first coordinate of some (in fact, exactly 
one) ordered pair in f, so the domain of f must actually be all of A. But the 
range of f need not be all of B. The elements of the range of f will be the 
second coordinates of all the ordered pairs in f, and the second coordinate of 
an ordered pair in f is what we have called the image of its first coordinate. 
Thus, the range of f could also be described as the set of all images of 
elements of A under f: 


Ran(f) = {f(a) | a ∈ A}. 


For example, for the function f defined in part 1 of Example 5.1.3, Ran(f) = 
{f(s) | s ∈ S} = the set of all advisors of students. 

We can draw diagrams of functions in exactly the same way we drew 
diagrams for relations in Chapter 4. If f: A → B, then as before, every 
ordered pair (a, b) ∈ f would be represented in the diagram by an edge 
connecting a to b. By the definition of function, every a ∈ A occurs as the 
first coordinate of exactly one ordered pair in f, and the second coordinate 
of this ordered pair is f(a). Thus, for every a ∈ A there will be exactly one 
edge coming from a, and it will connect a to f(a). For example, Figure 5.1 
shows what the diagram for the function L defined in part 3 of Example 
5.1.2 would look like. 


[Diagram not reproduced; the cities Boston and New York appear among its 
labeled points.] 


Figure 5.1. 


The definition of composition of relations can also be applied to 
functions. If f: A → B and g: B → C, then f is a relation from A to B and g is 
a relation from B to C, so g ∘ f will be a relation from A to C. In fact, it turns 
out that g ∘ f is a function from A to C, as the next theorem shows. 


Theorem 5.1.5. Suppose f: A → B and g: B → C. Then g ∘ f: A → C, and 
for any a ∈ A, the value of g ∘ f at a is given by the formula (g ∘ f)(a) = 
g(f(a)). 


Scratch work 


Before proving this theorem, it might be helpful to discuss the scratch work 
for the proof. According to the definition of function, to show that g ∘ f: A 
→ C we must prove that ∀a ∈ A ∃!c ∈ C((a, c) ∈ g ∘ f), so we will start 
out by letting a be an arbitrary element of A and then try to prove that ∃!c 
∈ C((a, c) ∈ g ∘ f). As we saw in Section 3.6, we can prove this statement 
by proving existence and uniqueness separately. To prove existence, we 
should try to find a c ∈ C such that (a, c) ∈ g ∘ f. For uniqueness, we 
should assume that (a, c₁) ∈ g ∘ f and (a, c₂) ∈ g ∘ f, and then try to prove 
that c₁ = c₂. 


Proof. Let a be an arbitrary element of A. We must show that there is a 
unique c ∈ C such that (a, c) ∈ g ∘ f. 

Existence: Let b = f(a) ∈ B. Let c = g(b) ∈ C. Then (a, b) ∈ f and (b, c) 
∈ g, so by the definition of composition of relations, (a, c) ∈ g ∘ f. Thus, ∃c 
∈ C((a, c) ∈ g ∘ f). 

Uniqueness: Suppose (a, c₁) ∈ g ∘ f and (a, c₂) ∈ g ∘ f. Then by the 
definition of composition, we can choose b₁ ∈ B such that (a, b₁) ∈ f and 
(b₁, c₁) ∈ g, and we can also choose b₂ ∈ B such that (a, b₂) ∈ f and 
(b₂, c₂) ∈ g. Since f is a function, there can be only one b ∈ B such that (a, 
b) ∈ f. Thus, since (a, b₁) and (a, b₂) are both elements of f, it follows that b₁ 
= b₂. But now applying the same reasoning to g, since (b₁, c₁) ∈ g and (b₁, 
c₂) = (b₂, c₂) ∈ g, it follows that c₁ = c₂, as required. 

This completes the proof that g ∘ f is a function from A to C. Finally, to 
derive the formula for (g ∘ f)(a), note that we showed in the existence half 
of the proof that for any a ∈ A, if we let b = f(a) and c = g(b), then (a, c) ∈ 
g ∘ f. Thus, 


(g ∘ f)(a) = c = g(b) = g(f(a)). 


When we first introduced the idea of the composition of two relations in 
Chapter 4, we pointed out that the notation was somewhat peculiar and 
promised to explain the reason for the notation in this chapter. We can now 
provide this explanation. The reason for the notation we’ve used for 
composition of relations is that it leads to the convenient formula (g ∘ f)(x) = 
g(f(x)) derived in Theorem 5.1.5. Note that because functions are just 
relations of a special kind, everything we have proven about composition of 
relations applies to composition of functions. In particular, by Theorem 
4.2.5, we know that composition of functions is associative. 
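
For finite sets, Theorem 5.1.5 can be checked directly from the definition of composition of relations. This is only a sketch under stated assumptions: the function compose and the particular sets f and g are invented for illustration, with f: A → B and g: B → C for A = {1, 2, 3}, B = {"x", "y"}, and C = {10, 20}.

```python
def compose(g, f):
    """Composition of relations: (a, c) is in g∘f iff there is some b
    with (a, b) in f and (b, c) in g."""
    return {(a, c) for (a, b) in f for (b2, c) in g if b == b2}


A = {1, 2, 3}
f = {(1, "x"), (2, "y"), (3, "x")}   # hypothetical f: A -> B
g = {("x", 10), ("y", 20)}           # hypothetical g: B -> C

gf = compose(g, f)
f_at, g_at, gf_at = dict(f), dict(g), dict(gf)

print(sorted(gf))                                  # [(1, 10), (2, 20), (3, 10)]
# Exactly one pair for each a in A, so g∘f is a function from A to C ...
print(len(gf) == len(A))                           # True
# ... and its value at a is g(f(a)).
print(all(gf_at[a] == g_at[f_at[a]] for a in A))   # True
```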


Example 5.1.6. Here are some examples of compositions of functions. 


1. Let C and N be the sets of all cities and countries, respectively, and let 
L: C → N be the function defined in part 3 of Example 5.1.2. Thus, 
for every city c, L(c) = the country in which c is located. Let B be the 
set of all buildings located in cities, and define F: B → C by the 
formula F(b) = the city in which the building b is located. Then L ∘ F: 
B → N. For example, F(Eiffel Tower) = Paris, so according to the 
formula derived in Theorem 5.1.5, 


(L ∘ F)(Eiffel Tower) = L(F(Eiffel Tower)) 
                      = L(Paris) = France. 


In general for every building b ∈ B, 


(L ∘ F)(b) = L(F(b)) = L(the city in which b is located) 
           = the country in which b is located. 


A diagram of this function is shown in Figure 5.2. 


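[Diagram not reproduced: three columns labeled B, C, and N, showing the 
buildings Empire State Building, U.N. Building, and Eiffel Tower, the city 
New York, and the countries United States and France, with an edge 
labeled L(F(b)) = (L ∘ F)(b).] 


Figure 5.2. 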


2. Let g: Z → R be the function from part 2 of Example 5.1.3, which 
was defined by the formula g(x) = 2x + 3. Let f: Z → Z be defined by 
the formula f(n) = n² − 3n + 1. Then g ∘ f: Z → R. For example, f(2) = 
2² − 3·2 + 1 = −1, so (g ∘ f)(2) = g(f(2)) = g(−1) = 1. In general for every 
n ∈ Z, 

(g ∘ f)(n) = g(f(n)) = g(n² − 3n + 1) = 2(n² − 3n + 1) + 3 
           = 2n² − 6n + 5. 
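
The algebra in part 2 of Example 5.1.6 can be spot-checked numerically. The sketch below uses the formulas for f and g from the example; the name g_after_f and the range of test values are choices made here for illustration, and a check on finitely many integers is of course not a proof.

```python
def f(n):
    return n * n - 3 * n + 1      # f(n) = n^2 - 3n + 1


def g(x):
    return 2 * x + 3              # g(x) = 2x + 3


def g_after_f(n):
    return g(f(n))                # (g∘f)(n) = g(f(n))


print(f(2), g_after_f(2))         # -1 1

# Compare with the simplified formula 2n^2 - 6n + 5 on a range of integers.
agrees = all(g_after_f(n) == 2 * n * n - 6 * n + 5 for n in range(-50, 51))
print(agrees)                     # True
```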


Exercises 


*1. (a) Let A = {1, 2, 3}, B = {4}, and f = {(1, 4), (2, 4), (3, 4)}. Is f a 
function from A to B? 

(b) Let A = {1}, B = {2, 3, 4}, and f = {(1, 2), (1, 3), (1, 4)}. Is f a 
function from A to B? 

(c) Let C be the set of all cars registered in your state, and let S be the set 
of all finite sequences of letters and digits. Let L = {(c, s) ∈ C × S | 
the license plate number of the car c is s}. Is L a function from C to 
S? 

2. (a) Let f be the relation represented by the graph in Figure 5.3. Is f a 
function from A to B? 


Figure 5.3. 


(b) Let W be the set of all words of English, and let A be the set of all 
letters of the alphabet. Let f = {(w, a) ∈ W × A | the letter a occurs in 
the word w}, and let g = {(w, a) ∈ W × A | the letter a is the first 
letter of the word w}. Is f a function from W to A? How about g? 

(c) John, Mary, Susan, and Fred go out to dinner and sit at a round table. 
Let P = {John, Mary, Susan, Fred}, and let R = {(p, q) ∈ P × P | the 
person p is sitting immediately to the right of the person q}. Is R a 
function from P to P? 

*3. (a) Let A = {a, b, c}, B = {a, b}, and f = {(a, b), (b, b), (c, a)}. Then 
f: A → B. What are f(a), f(b), and f(c)? 

(b) Let f: R → R be the function defined by the formula f(x) = x² − 2x. 
What is f(2)? 

(c) Let f = {(x, n) ∈ R × Z | n ≤ x < n + 1}. Then f: R → Z. What is f(π)? 
What is f(−π)? 

4. (a) Let N be the set of all countries and C the set of all cities. Let H: 
N → C be the function defined by the rule that for every country 
n, H(n) = the capital of the country n. What is H(Italy)? 

(b) Let A = {1, 2, 3} and B = 𝒫(A). Let F: B → B be the function defined 
by the formula F(X) = A \ X. What is F({1, 3})? 

(c) Let f: R → R × R be the function defined by the formula f(x) = (x + 1, 
x − 1). What is f(2)? 

*5. Let L be the function defined in part 3 of Example 5.1.2 and let H be 
the function defined in exercise 4(a). Describe L ∘ H and H ∘ L. 
6. Let f and g be functions from R to R defined by the following 
formulas: 


f(x) = 1/(x² + 2),     g(x) = 2x − 1. 


Find formulas for (f ∘ g)(x) and (g ∘ f)(x). 

*7. Suppose f: A → B and C ⊆ A. The set f ∩ (C × B), which is a relation 
from C to B, is called the restriction of f to C, and is sometimes 
denoted f ↾ C. In other words, 


f ↾ C = f ∩ (C × B). 


(a) Prove that f ↾ C is a function from C to B and that for all c ∈ C, f(c) = 
(f ↾ C)(c). 

(b) Suppose g: C → B. Prove that g = f ↾ C iff g ⊆ f. 

(c) Let g and h be the functions defined in parts 2 and 3 of Example 5.1.3. 
Show that g = h ↾ Z. 


8. Suppose f: A → B and g ⊆ f. Prove that there is a set A′ ⊆ A such that 
g: A′ → B. 


9. Suppose f: A → B, B ≠ ∅, and A ⊆ A′. Prove that there is a function 
g: A′ → B such that f ⊆ g. 


*10. Suppose that f and g are functions from A to B and f ≠ g. Show that f △ g 
is not a function. 
11. Suppose A is a set. Show that i_A is the only relation on A that is both 
an equivalence relation on A and also a function from A to A. 
12. Suppose f: A → C and g: B → C. 
(a) Prove that if A and B are disjoint, then f ∪ g: A ∪ B → C. 
(b) Prove that f ∪ g: A ∪ B → C iff f ↾ (A ∩ B) = g ↾ (A ∩ B). (See 
exercise 7 for the meaning of the notation used here.) 
*13. Suppose R is a relation from A to B, S is a relation from B to C, 
Ran(R) = Dom(S) = B, and S ∘ R: A → C. 
(a) Prove that S: B → C. 
(b) Give an example to show that it need not be the case that R: A → B. 
14. Suppose f: A → B and S is a relation on B. Define a relation R on A as 
follows: 


R = {(x, y) ∈ A × A | (f(x), f(y)) ∈ S}. 


(a) Prove that if S is reflexive, then so is R. 
(b) Prove that if S is symmetric, then so is R. 
(c) Prove that if S is transitive, then so is R. 
*15. Suppose f: A → B and R is a relation on A. Define a relation S on B as 
follows: 


S = {(x, y) ∈ B × B | ∃u ∈ A ∃v ∈ A(f(u) = x ∧ f(v) = y ∧ (u, v) ∈ R)}. 


Justify your answers to the following questions with either proofs or 
counterexamples. 

(a) If R is reflexive, must it be the case that S is reflexive? 

(b) If R is symmetric, must it be the case that S is symmetric? 

(c) If R is transitive, must it be the case that S is transitive? 


16. Suppose A and B are sets, and let ℱ = {f | f: A → B}. Also, suppose R 
is a relation on B, and define a relation S on ℱ as follows: 


S = {(f, g) ∈ ℱ × ℱ | ∀x ∈ A((f(x), g(x)) ∈ R)}. 


Justify your answers to the following questions with either proofs or 
counterexamples. 

(a) If R is reflexive, must it be the case that S is reflexive? 

(b) If R is symmetric, must it be the case that S is symmetric? 

(c) If R is transitive, must it be the case that S is transitive? 

17. Suppose A is a nonempty set and f: A → A. 

(a) Suppose there is some a ∈ A such that ∀x ∈ A(f(x) = a). (In this case, 
f is called a constant function.) Prove that for all g: A → A, f ∘ g = f. 

(b) Suppose that for all g: A → A, f ∘ g = f. Prove that f is a constant 
function. (Hint: What happens if g is a constant function?) 


18. Let ℱ = {f | f: ℝ → ℝ}. Let R = {(f, g) ∈ ℱ × ℱ | ∃a ∈ ℝ ∀x > 
a(f(x) = g(x))}. 
(a) Let f: ℝ → ℝ and g: ℝ → ℝ be the functions defined by the formulas 
f(x) = |x| and g(x) = x. Show that (f, g) ∈ R. 
(b) Prove that R is an equivalence relation. 
*19. Let ℱ = {f | f: ℤ⁺ → ℝ}. For g ∈ ℱ we define the set O(g) as 
follows: 


O(g) = {f ∈ ℱ | ∃a ∈ ℤ⁺ ∃c ∈ ℝ⁺ ∀x > a(|f(x)| ≤ c|g(x)|)}. 


(If f ∈ O(g), then mathematicians say that “f is big-oh of g.”) 

(a) Let f: ℤ⁺ → ℝ and g: ℤ⁺ → ℝ be defined by the formulas f(x) = 7x + 3 
and g(x) = x². Prove that f ∈ O(g), but g ∉ O(f). 

(b) Let S = {(f, g) ∈ ℱ × ℱ | f ∈ O(g)}. Prove that S is a preorder, but not 
a partial order. (See exercise 25 of Section 4.5 for the definition of 
preorder.) 

(c) Suppose f₁ ∈ O(g) and f₂ ∈ O(g), and s and t are real numbers. Define 
a function f: ℤ⁺ → ℝ by the formula f(x) = sf₁(x) + tf₂(x). Prove that f 
∈ O(g). (Hint: You may find the triangle inequality helpful. See 
exercise 13(c) of Section 3.5.) 

20. (a) Suppose g: A → B and let R = {(x, y) ∈ A × A | g(x) = g(y)}. 
Show that R is an equivalence relation on A. 

(b) Suppose R is an equivalence relation on A and let g: A → A/R be the 
function defined by the formula g(x) = [x]_R. Show that R = {(x, y) ∈ 
A × A | g(x) = g(y)}. 

*21. Suppose f: A → B and R is an equivalence relation on A. We will say 
that f is compatible with R if ∀x ∈ A ∀y ∈ A(xRy → f(x) = f(y)). (You 
might want to compare this exercise to exercise 24 of Section 4.5.) 


(a) Suppose f is compatible with R. Prove that there is a unique function 
h: A/R → B such that for all x ∈ A, h([x]_R) = f(x). 


(b) Suppose h: A/R → B and for all x ∈ A, h([x]_R) = f(x). Prove that f is 
compatible with R. 

22. Let R = {(x, y) ∈ N × N | x ≡ y (mod 5)}. Note that by Theorem 
4.5.10 and exercise 14 in Section 4.5, R is an equivalence relation on 
N. 

(a) Show that there is a unique function h: N/R → N/R such that for every 
natural number x, h([x]_R) = [x²]_R. (Hint: Use exercise 21.) 

(b) Show that there is no function h: N/R → N/R such that for every 
natural number x, h([x]_R) = [2^x]_R. 


5.2 One-to-One and Onto 


In the last section we saw that the composition of two functions is again a 
function. What about inverses of functions? If f: A → B, then f is a relation 
from A to B, so f⁻¹ is a relation from B to A. Is it a function from B to A? 
We’ll answer this question in the next section. As we will see, the answer 
hinges on the following two properties of functions. 


Definition 5.2.1. Suppose f: A → B. We will say that f is one-to-one if 

¬∃a₁ ∈ A ∃a₂ ∈ A(f(a₁) = f(a₂) ∧ a₁ ≠ a₂). 

We say that f maps onto B (or just is onto if B is clear from context) if 

∀b ∈ B ∃a ∈ A(f(a) = b). 


One-to-one functions are sometimes also called injections, and onto 
functions are sometimes called surjections. 


Note that our definition of one-to-one starts with the negation symbol ¬. 
In other words, to say that f is one-to-one means that a certain situation does 
not occur. The situation that must not occur is that there are two different 
elements of the domain of f, a₁ and a₂, such that f(a₁) = f(a₂). This situation 
is illustrated in Figure 5.4(a). Thus, the function in Figure 5.4(a) is not 
one-to-one. Figure 5.4(b) shows a function that is one-to-one. 


(a) f is not one-to-one. (b) f is one-to-one. 


Figure 5.4. 


If f: A → B, then to say that f is onto means that every element of B is the 
image under f of some element of A. In other words, in the diagram of f, 
every element of B has an edge pointing to it. Neither of the functions in 
Figure 5.4 is onto, because in both cases there are elements of B without 
edges pointing to them. Figure 5.5 shows two functions that are onto. 


(a) f is onto but not one-to-one. (b) f is both one-to-one and onto. 


Figure 5.5. 


Example 5.2.2. Are the following functions one-to-one? Are they onto? 


1. The function F from part 1 of Example 5.1.2. 
2. The function L from part 3 of Example 5.1.2. 
3. The identity function i_A, for any set A. 
4. The function g from part 2 of Example 5.1.3. 
5. The function h from part 3 of Example 5.1.3. 


Solutions 


1. F is not one-to-one because F(1) = 5 = F(3). It is also not onto, 
because 6 ∈ B but there is no a ∈ A such that F(a) = 6. 

2. L is not one-to-one because there are many pairs of different cities c₁ 
and c₂ for which L(c₁) = L(c₂). For example, L(Chicago) = United States 
= L(Seattle). To say that L is onto means that ∀n ∈ N ∃c ∈ C(L(c) = 
n), or in other words, for every country n there is a city c such that the 
city c is located in the country n. This is probably true, since it is 
unlikely that there is a country that contains no cities at all. Thus, L is 
probably onto. 

3. To decide whether i_A is one-to-one we must determine whether there 
are two elements a₁ and a₂ of A such that i_A(a₁) = i_A(a₂) and a₁ ≠ a₂. 
But as we saw in Section 5.1, for every a ∈ A, i_A(a) = a, so i_A(a₁) = 
i_A(a₂) means a₁ = a₂. Thus, there cannot be elements a₁ and a₂ of A 
such that i_A(a₁) = i_A(a₂) and a₁ ≠ a₂, so i_A is one-to-one. 

To say that i_A is onto means that for every a ∈ A, a = i_A(b) for some b 
∈ A. This is clearly true because, in fact, a = i_A(a). Thus i_A is also onto. 


4. As in solution 3, to decide whether g is one-to-one, we must 
determine whether there are integers n₁ and n₂ such that g(n₁) = g(n₂) 
and n₁ ≠ n₂. According to the definition of g, we have 


g(n₁) = g(n₂) iff 2n₁ + 3 = 2n₂ + 3 
              iff 2n₁ = 2n₂ 
              iff n₁ = n₂. 


Thus there can be no integers n₁ and n₂ for which g(n₁) = g(n₂) and n₁ ≠ 
n₂. In other words, g is one-to-one. However, g is not onto because, for 
example, there is no integer n for which g(n) = 0. To see why, suppose n is 
an integer and g(n) = 0. Then by the definition of g we have 2n + 3 = 0, so 
n = −3/2. But this contradicts the fact that n is an integer. Note that the 
domain of g is Z, so for g to be onto it must be the case that for every real 
number y there is an integer n such that g(n) = y. Since we have seen that 
there is no integer n such that g(n) = 0, we can conclude that g is not onto. 


5. This function is both one-to-one and onto. The verification that h is 
one-to-one is very similar to the verification in solution 4 that g is 
one-to-one, and it is left to the reader. To see that h is onto, we must 
show that ∀y ∈ R ∃x ∈ R(h(x) = y). Here is a brief proof of this 
statement. Let y be an arbitrary real number. Let x = (y − 3)/2. Then 
h(x) = 2x + 3 = 2 · ((y − 3)/2) + 3 = y − 3 + 3 = y. Thus, ∀y ∈ R ∃x ∈ R 
(h(x) = y), so h is onto. 
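
For functions on finite sets, the two properties can be tested mechanically. The following sketch is illustrative only: the helper names is_one_to_one and is_onto are assumptions, functions are represented as sets of ordered pairs, F is the function of solution 1, and the identity function is taken on the sample set {1, 2, 3}. The last lines spot-check the choice x = (y − 3)/2 from solution 5 with exact rational arithmetic on sample values, which does not replace the proof.

```python
from fractions import Fraction


def is_one_to_one(f):
    """True if no two ordered pairs of f share a second coordinate."""
    images = [b for (_, b) in f]
    return len(images) == len(set(images))


def is_onto(f, B):
    """True if every element of B is an image, that is, Ran(f) = B."""
    return {b for (_, b) in f} == set(B)


A, B = {1, 2, 3}, {4, 5, 6}
F = {(1, 5), (2, 4), (3, 5)}
identity = {(a, a) for a in A}

print(is_one_to_one(F), is_onto(F, B))                # False False
print(is_one_to_one(identity), is_onto(identity, A))  # True True


def h(x):
    return 2 * x + 3


# For each sampled y, x = (y - 3)/2 satisfies h(x) = y.
samples = [Fraction(k, 7) for k in range(-20, 21)]
h_hits_samples = all(h((y - 3) / 2) == y for y in samples)
print(h_hits_samples)                                 # True
```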


Although the definition of one-to-one is easiest to understand when it is 
stated as a negative statement, as in Definition 5.2.1, we know from 
Chapter 3 that the definition will be easier to use in proofs if we reexpress it 
as an equivalent positive statement. The following theorem shows how to 
do this. It also gives a useful equivalence for the definition of onto. 


Theorem 5.2.3. Suppose f: A → B. 
1. f is one-to-one iff ∀a₁ ∈ A ∀a₂ ∈ A(f(a₁) = f(a₂) → a₁ = a₂). 
2. f is onto iff Ran(f) = B. 

Proof. 


1. We use the rules from Chapters 1 and 2 for reexpressing negative 
statements as positive ones. 

f is one-to-one iff ¬∃a₁ ∈ A ∃a₂ ∈ A(f(a₁) = f(a₂) ∧ a₁ ≠ a₂) 
                iff ∀a₁ ∈ A ∀a₂ ∈ A ¬(f(a₁) = f(a₂) ∧ a₁ ≠ a₂) 
                iff ∀a₁ ∈ A ∀a₂ ∈ A(f(a₁) ≠ f(a₂) ∨ a₁ = a₂) 
                iff ∀a₁ ∈ A ∀a₂ ∈ A(f(a₁) = f(a₂) → a₁ = a₂). 

2. First we relate the definition of onto to the definition of range. 

f is onto iff ∀b ∈ B ∃a ∈ A(f(a) = b) 
          iff ∀b ∈ B ∃a ∈ A((a, b) ∈ f) 
          iff ∀b ∈ B(b ∈ Ran(f)) 
          iff B ⊆ Ran(f). 


Now we are ready to prove part 2 of the theorem. 

(→) Suppose f is onto. By the equivalence just derived we have B ⊆ 
Ran(f), and by the definition of range we have Ran(f) ⊆ B. Thus, it 
follows that Ran(f) = B. 

(←) Suppose Ran(f) = B. Then certainly B ⊆ Ran(f), so by the 
equivalence, f is onto. 
□ 


Commentary. It is often most efficient to write the proof of an iff statement 
as a string of equivalences, if this can be done. In the case of statement 1 
this is easy, using rules of logic. For statement 2 this strategy doesn’t quite 
work, but it does give us an equivalence that turns out to be useful in the 
proof. 


Example 5.2.4. Let A = R \ {−1}, and define f: A → R by the formula 


f(a) = 2a/(a + 1). 


Prove that f is one-to-one but not onto. 

Scratch work 


By part 1 of Theorem 5.2.3, we can prove that f is one-to-one by proving 
the equivalent statement ∀a₁ ∈ A ∀a₂ ∈ A(f(a₁) = f(a₂) → a₁ = a₂). Thus, 
we let a₁ and a₂ be arbitrary elements of A, assume f(a₁) = f(a₂), and then 
prove a₁ = a₂. This is the strategy that is almost always used when proving 
that a function is one-to-one. The remaining details of the proof involve 
only simple algebra and are given later. 

To show that f is not onto we must prove ¬∀x ∈ R ∃a ∈ A(f(a) = x). 
Reexpressing this as a positive statement, we see that we must prove ∃x ∈ 
R ∀a ∈ A(f(a) ≠ x), so we should try to find a particular real number x such 
that ∀a ∈ A(f(a) ≠ x). Unfortunately, it is not at all clear what value we 
should use for x. We’ll use a somewhat unusual procedure to overcome this 
difficulty. Instead of trying to prove that f is not onto, let’s try to prove that 
it is onto! Of course, we’re expecting that this proof won’t work, but maybe 
seeing why it won’t work will help us figure out what value of x to use in 
the proof that f is not onto. 

To prove that f is onto we would have to prove ∀x ∈ R ∃a ∈ A(f(a) = x), 
so we should let x be an arbitrary real number and try to find some a ∈ A 
such that f(a) = x. Filling in the definition of f, we see that we must find a ∈ 
A such that 


2a/(a + 1) = x. 


To find this value of a, we simply solve the equation for a: 


2a/(a + 1) = x  ⇒  2a = ax + x  ⇒  a(2 − x) = x  ⇒  a = x/(2 − x). 


Aha! The last step in this derivation wouldn’t work if x = 2, because then 
we would be dividing by 0. This is the only value of x that seems to cause 
trouble when we try to find a value of a for which f(a) = x. Perhaps x = 2 is 
the value to use in the proof that f is not onto. 


Let’s return now to the proof that f is not onto. If we let x = 2, then to 
complete the proof we must show that ∀a ∈ A(f(a) ≠ 2). We’ll do this by 
letting a be an arbitrary element of A, assuming f(a) = 2, and then trying to 
derive a contradiction. The remaining details of the proof are not hard. 


Solution 


Proof. To see that f is one-to-one, let a₁ and a₂ be arbitrary elements of A 
and assume that f(a₁) = f(a₂). Applying the definition of f, it follows that 
2a₁/(a₁ + 1) = 2a₂/(a₂ + 1). Thus, 2a₁(a₂ + 1) = 2a₂(a₁ + 1). Multiplying out 
both sides gives us 2a₁a₂ + 2a₁ = 2a₁a₂ + 2a₂, so 2a₁ = 2a₂ and therefore a₁ 
= a₂. 

To show that f is not onto we will prove that ∀a ∈ A(f(a) ≠ 2). Suppose 
a ∈ A and f(a) = 2. Applying the definition of f, we get 2a/(a + 1) = 2. 
Thus, 2a = 2a + 2, which is clearly impossible. Thus, 2 ∉ Ran(f), so Ran(f) 
≠ R and therefore f is not onto. 


□ 
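
The scratch work for Example 5.2.4 suggests a quick experiment. The sketch below uses exact rational arithmetic; the name preimage and the particular sample values are assumptions made here for illustration. It checks that a = x/(2 − x) lies in A and is sent to x for every sampled x ≠ 2, and that no sampled element of A is sent to 2. Sampling cannot prove either claim, but it agrees with the proof.

```python
from fractions import Fraction


def f(a):
    """f(a) = 2a / (a + 1), defined for a != -1."""
    return 2 * a / (a + 1)


def preimage(x):
    """The candidate a = x / (2 - x) from the scratch work; requires x != 2."""
    return x / (2 - x)


quarters = [Fraction(k, 4) for k in range(-40, 41)]

# Every sampled x other than 2 is hit by an element of A = R \ {-1}.
targets = [x for x in quarters if x != 2]
all_hit = all(preimage(x) != -1 and f(preimage(x)) == x for x in targets)
print(all_hit)        # True

# No sampled element of A is sent to 2.
domain = [a for a in quarters if a != -1]
two_is_hit = any(f(a) == 2 for a in domain)
print(two_is_hit)     # False
```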


As we saw in the preceding example, when proving that a function f is 
one-to-one it is usually easiest to prove the equivalent statement ∀a₁ ∈ 
A ∀a₂ ∈ A(f(a₁) = f(a₂) → a₁ = a₂) given in part 1 of Theorem 5.2.3. Of 
course, this is just an example of the fact that it is generally easier to prove 
a positive statement than a negative one. This equivalence is also often used 
in proofs in which we are given that a function is one-to-one, as you will 
see in the proof of part 1 of the following theorem. 


Theorem 5.2.5. Suppose f: A → B and g: B → C. As we saw in Theorem 
5.1.5, it follows that g ∘ f: A → C. 


1. If f and g are both one-to-one, then so is g ∘ f. 
2. If f and g are both onto, then so is g ∘ f. 


Proof. 


1. Suppose f and g are both one-to-one. Let a₁ and a₂ be arbitrary 
elements of A and suppose that (g ∘ f)(a₁) = (g ∘ f)(a₂). By Theorem 
5.1.5 this means that g(f(a₁)) = g(f(a₂)). Since g is one-to-one it 
follows that f(a₁) = f(a₂), and similarly since f is one-to-one we can 
then conclude that a₁ = a₂. Thus, g ∘ f is one-to-one. 


2. Suppose f and g are both onto, and let c be an arbitrary element of C. 
Since g is onto, we can find some b ∈ B such that g(b) = c. Similarly, 
since f is onto, there is some a ∈ A such that f(a) = b. Then (g ∘ f)(a) 
= g(f(a)) = g(b) = c. Thus, g ∘ f is onto. 


Commentary. 


1. As in Example 5.2.4, we prove that g ∘ f is one-to-one by proving that 
∀a₁ ∈ A ∀a₂ ∈ A((g ∘ f)(a₁) = (g ∘ f)(a₂) → a₁ = a₂). Thus, we let a₁ 
and a₂ be arbitrary elements of A, assume that (g ∘ f)(a₁) = (g ∘ f)(a₂), 
which means g(f(a₁)) = g(f(a₂)), and then prove that a₁ = a₂. The 
next sentence of the proof says that the assumption that g is 
one-to-one is being used, but it might not be clear how it is being used. To 
understand this step, let’s write out what it means to say that g is 
one-to-one. As we observed before, rather than using the original 
definition, which is a negative statement, we are probably better off 
using the equivalent positive statement ∀b₁ ∈ B ∀b₂ ∈ B(g(b₁) = 
g(b₂) → b₁ = b₂). The natural way to use a given of this form is to 
plug something in for b₁ and b₂. Plugging in f(a₁) and f(a₂), we get 
g(f(a₁)) = g(f(a₂)) → f(a₁) = f(a₂), and since we know g(f(a₁)) = 
g(f(a₂)), it follows by modus ponens that f(a₁) = f(a₂). None of this was 
explained in the proof; readers of the proof are expected to work it out 
for themselves. Make sure you understand how, using similar 
reasoning, you can get from f(a₁) = f(a₂) to a₁ = a₂ by applying the 
fact that f is one-to-one. 


2. After the assumption that f and g are both onto, the form of the rest of 
the proof is entirely guided by the logical form of the goal of proving 
that g ∘ f is onto. Because this means ∀c ∈ C ∃a ∈ A((g ∘ f)(a) = c), 
we let c be an arbitrary element of C and then find some a ∈ A for 
which we can prove (g ∘ f)(a) = c. 


□ 
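
Theorem 5.2.5 can be confirmed by brute force on small finite sets. The sketch below is an illustration under stated assumptions, not a substitute for the proof: the helper names are invented, functions are represented as Python dictionaries, and the search covers only sets with at most three elements. It enumerates every f: A → B and g: B → C and counts cases in which the theorem would fail.

```python
from itertools import product


def all_functions(X, Y):
    """Yield every function from X to Y, each represented as a dict."""
    for values in product(Y, repeat=len(X)):
        yield dict(zip(X, values))


def one_to_one(f):
    return len(set(f.values())) == len(f)


def onto(f, Y):
    return set(f.values()) == set(Y)


counterexamples = 0
for m, n, k in product(range(1, 4), repeat=3):
    A, B, C = list(range(m)), list(range(n)), list(range(k))
    for f in all_functions(A, B):
        for g in all_functions(B, C):
            gf = {a: g[f[a]] for a in A}      # (g∘f)(a) = g(f(a))
            if one_to_one(f) and one_to_one(g) and not one_to_one(gf):
                counterexamples += 1
            if onto(f, B) and onto(g, C) and not onto(gf, C):
                counterexamples += 1

print(counterexamples)  # 0
```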


Functions that are both one-to-one and onto are particularly important in 
mathematics. Such functions are sometimes called one-to-one 
correspondences or bijections. Figure 5.5(b) shows an example of a one-to-one 
correspondence. Notice that in this figure both A and B have four elements. 
In fact, you should be able to convince yourself that if there is a one-to-one 
correspondence between two finite sets, then the sets must have the same 
number of elements. This is one of the reasons why one-to-one 
correspondences are so important. We will discuss one-to-one 
correspondences between infinite sets in Chapter 8. 

Here’s another example of a one-to-one correspondence. Suppose A is the 
set of all members of the audience at a sold-out concert and S is the set of 
all seats in the concert hall. Let f: A → S be the function defined by the rule 


f(a) = the seat in which a is sitting. 


Because different people would not be sitting in the same seat, f is one-to- 
one. Because the concert is sold out, every seat is taken, so f is onto. Thus, f 
is a one-to-one correspondence. Even without counting people or seats, we 
can tell that the number of people in the audience must be the same as the 
number of seats in the concert hall. 
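
The claim that a one-to-one correspondence between finite sets forces them to have the same number of elements can be tested exhaustively for very small sets. In this sketch, which assumes nothing beyond the definitions above, the function name exists_bijection and the size limit of four are choices made for illustration.

```python
from itertools import product


def exists_bijection(m, n):
    """Search all functions from an m-element set to an n-element set
    for one that is both one-to-one and onto."""
    Y = range(n)
    for values in product(Y, repeat=m):       # values[i] is the image of i
        is_one_to_one = len(set(values)) == m
        is_onto = set(values) == set(Y)
        if is_one_to_one and is_onto:
            return True
    return False


sizes = range(1, 5)
same_size_only = all(exists_bijection(m, n) == (m == n)
                     for m in sizes for n in sizes)
print(same_size_only)  # True
```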


Exercises 


1. Which of the functions in exercise 1 of Section 5.1 are one-to-one? 
Which are onto? 

*2. Which of the functions in exercise 2 of Section 5.1 are one-to-one? 
Which are onto? 


3. Which of the functions in exercise 3 of Section 5.1 are one-to-one? 
Which are onto? 


4. Which of the functions in exercise 4 of Section 5.1 are one-to-one? 
Which are onto? 
*5. Let A = R \ {1}, and let f: A → A be defined as follows: 


f(x) = (x + 1)/(x − 1). 


(a) Show that f is one-to-one and onto. 

(b) Show that f ∘ f = i_A. 


6. Suppose a and b are real numbers and a ≠ 0. Define f: R → R by the 
formula f(x) = ax + b. Show that f is one-to-one and onto. 

7. Define f: R⁺ → R by the formula f(x) = 1/x − x. 

(a) Show that f is one-to-one. (Hint: You may find it useful to prove first 
that if 0 < a < b then f(a) > f(b).) 

(b) Show that f is onto. 

(c) Define g: R⁺ → R by the formula g(x) = 1/x + x. Is g one-to-one? Is it 
onto? 


8. Let A = 𝒫(R). Define f: R → A by the formula f(x) = {y ∈ R | y² < 
x}. 
(a) Find f(2). 
(b) Is f one-to-one? Is it onto? 
*9. Let A = 𝒫(R) and B = 𝒫(A). Define f: B → A by the formula f(ℱ) = 
∪ℱ. 


(a) Find f({{1, 2}, {3, 4}}). 

(b) Is f one-to-one? Is it onto? 

10. Suppose f: A → B and g: B → C. 

(a) Prove that if g ∘ f is onto then g is onto. 

(b) Prove that if g ∘ f is one-to-one then f is one-to-one. 

11. Suppose f: A → B and g: B → C. 

(a) Prove that if f is onto and g is not one-to-one, then g ∘ f is not 
one-to-one. 

(b) Prove that if f is not onto and g is one-to-one, then g ∘ f is not onto. 

12. Suppose f: A → B. Define a function g: B → 𝒫(A) by the formula 
g(b) = {a ∈ A | f(a) = b}. Prove that if f is onto then g is one-to-one. 
What if f is not onto? 

*13. Suppose f: A → B and C ⊆ A. In exercise 7 of Section 5.1 we defined 
f ↾ C (the restriction of f to C), and you showed that f ↾ C: C → B. 
(a) Prove that if f is one-to-one, then so is f ↾ C. 
(b) Prove that if f ↾ C is onto, then so is f. 
(c) Give examples to show that the converses of parts (a) and (b) are not 
always true. 

14. Suppose f: A → B, and there is some b ∈ B such that ∀x ∈ A(f(x) = 
b). (Thus, f is a constant function.) 
(a) Prove that if A has more than one element then f is not one-to-one. 
(b) Prove that if B has more than one element then f is not onto. 

15. Suppose f: A → C, g: B → C, and A and B are disjoint. In exercise 
12(a) of Section 5.1 you proved that f ∪ g: A ∪ B → C. Now suppose 
that f and g are both one-to-one. Prove that f ∪ g is one-to-one iff 
Ran(f) and Ran(g) are disjoint. 

16. Suppose R is a relation from A to B, S is a relation from B to C, 
Ran(R) = Dom(S) = B, and S ∘ R: A → C. In exercise 13(a) of Section 
5.1 you proved that S: B → C. Now prove that if S is one-to-one then 
R: A → B. 

*17. Suppose f: A → B and R is a relation on A. As in exercise 15 of 
Section 5.1, define a relation S on B as follows: 


S = {(x, y) ∈ B × B | ∃u ∈ A ∃v ∈ A(f(u) = x ∧ f(v) = y ∧ (u, v) ∈ R)}. 


(a) Prove that if R is reflexive and f is onto then S is reflexive. 
(b) Prove that if R is transitive and f is one-to-one then S is transitive. 

18. Suppose R is an equivalence relation on A, and let g: A → A/R be 
defined by the formula g(x) = [x]_R, as in exercise 20(b) in Section 5.1. 
(a) Show that g is onto. 
(b) Show that g is one-to-one iff R = i_A. 

19. Suppose f: A → B, R is an equivalence relation on A, and f is 
compatible with R. (See exercise 21 of Section 5.1 for the definition 
of compatible.) In exercise 21(a) of Section 5.1 you proved that there 
is a unique function h: A/R → B such that for all x ∈ A, h([x]_R) = f(x). 
Now prove that h is one-to-one iff ∀x ∈ A ∀y ∈ A(f(x) = f(y) → xRy). 

*20. Suppose A, B, and C are sets and f: A → B. 
(a) Prove that if f is onto, g: B → C, h: B → C, and g ∘ f = h ∘ f, then g = h. 
(b) Suppose that C has at least two elements, and for all functions g and h 
from B to C, if g ∘ f = h ∘ f then g = h. Prove that f is onto. 

21. Suppose A, B, and C are sets and f: B → C. 
(a) Prove that if f is one-to-one, g: A → B, h: A → B, and f ∘ g = f ∘ h, then 
g = h. 
(b) Suppose that A ≠ ∅, and for all functions g and h from A to B, if f ∘ g 
= f ∘ h then g = h. Prove that f is one-to-one. 


22. Let ℱ = {f | f: ℝ → ℝ}, and define a relation R on ℱ as follows: 


R = {(f, g) ∈ ℱ × ℱ | ∃h ∈ ℱ(f = h ∘ g)}. 


(a) Let f, g, and h be the functions from ℝ to ℝ defined by the formulas 
f(x) = x² + 1, g(x) = x³ + 1, and h(x) = x⁴ + 1. Prove that hRf, but it is 
not the case that gRf. 

(b) Prove that R is a preorder. (See exercise 25 of Section 4.5 for the 
definition of preorder.) 

(c) Prove that for all f ∈ ℱ, f R i_ℝ. 

(d) Prove that for all f ∈ ℱ, i_ℝ R f iff f is one-to-one. (Hint for right-to-left 
direction: Suppose f is one-to-one. Let A = Ran(f), and let h = f⁻¹ ∪ 
((ℝ \ A) × {0}). Now prove that h: ℝ → ℝ and i_ℝ = h ∘ f.) 

(e) Suppose that g ∈ ℱ is a constant function; in other words, there is 
some real number c such that ∀x ∈ ℝ(g(x) = c). Prove that for all f ∈ 
ℱ, gRf. (Hint: See exercise 17 of Section 5.1.) 

(f) Suppose that g ∈ ℱ is a constant function. Prove that for all f ∈ ℱ, f 
Rg iff f is a constant function. 

(g) As in exercise 25 of Section 4.5, if we let S = R ∩ R⁻¹, then S is an 
equivalence relation on ℱ. Also, there is a unique relation T on ℱ/S 
such that for all f and g in ℱ, [f]_S T [g]_S iff f Rg, and T is a partial 
order on ℱ/S. Prove that the set of all one-to-one functions from ℝ to 
ℝ is the largest element of ℱ/S in the partial order T, and the set of all 
constant functions from ℝ to ℝ is the smallest element. 


23. Let f: ℕ → ℕ be defined by the formula f(n) = n. Note that we could
also say that f: ℕ → ℤ. This exercise will illustrate why, in Definition
5.2.1, we defined the phrase “f maps onto B,” rather than simply “f is
onto.”

(a) Does f map onto ℕ?

(b) Does f map onto ℤ?


5.3 Inverses of Functions 


We are now ready to return to the question of whether the inverse of a
function from A to B is always a function from B to A. Consider again the
function F from part 1 of Example 5.1.2. Recall that in that example we had
A = {1, 2, 3}, B = {4, 5, 6}, and F = {(1, 5), (2, 4), (3, 5)}. As we saw in
Example 5.1.2, F is a function from A to B. According to the definition of
the inverse of a relation, F⁻¹ = {(5, 1), (4, 2), (5, 3)}, which is clearly a
relation from B to A. But F⁻¹ fails to be a function from B to A for two
reasons. First of all, 6 ∈ B, but 6 isn’t paired with any element of A in the
relation F⁻¹. Second, 5 is paired with two different elements of A, 1 and 3.
Thus, this example shows that the inverse of a function from A to B is not
always a function from B to A.
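The check described above can be sketched in Python, treating a relation as a set of ordered pairs. The helper names `inverse` and `is_function` are illustrative choices, not notation from the text.

```python
def inverse(relation):
    """Return the inverse relation: every pair (x, y) becomes (y, x)."""
    return {(y, x) for (x, y) in relation}


def is_function(relation, domain, codomain):
    """Check that the relation pairs each element of `domain` with exactly one element of `codomain`."""
    if any(x not in domain or y not in codomain for (x, y) in relation):
        return False
    return all(sum(1 for (x, _) in relation if x == d) == 1 for d in domain)


A = {1, 2, 3}
B = {4, 5, 6}
F = {(1, 5), (2, 4), (3, 5)}
F_inv = inverse(F)

print(is_function(F, A, B))      # F is a function from A to B
print(is_function(F_inv, B, A))  # F_inv fails: 6 has no partner and 5 has two
```

Both failures noted in the text show up in the second call: no pair of `F_inv` starts with 6, and two pairs start with 5.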

You may have noticed that the reasons why F⁻¹ isn’t a function from B to
A are related to the reasons why F is neither one-to-one nor onto, which 
were discussed in part 1 of Example 5.2.2. This suggests the following 
theorem. 


Theorem 5.3.1. Suppose f: A → B. If f is one-to-one and onto, then f⁻¹: B
→ A.


Proof. Suppose f is one-to-one and onto, and let b be an arbitrary element of
B. To show that f⁻¹ is a function from B to A, we must prove that ∃!a ∈
A((b, a) ∈ f⁻¹), so we prove existence and uniqueness separately.

Existence: Since f is onto, there is some a ∈ A such that f(a) = b. Thus,
(a, b) ∈ f, so (b, a) ∈ f⁻¹.

Uniqueness: Suppose (b, a₁) ∈ f⁻¹ and (b, a₂) ∈ f⁻¹ for some a₁, a₂ ∈ A.
Then (a₁, b) ∈ f and (a₂, b) ∈ f, so f(a₁) = b = f(a₂). Since f is one-to-one, it
follows that a₁ = a₂.
□


Commentary. The form of the proof is guided by the logical form of the
statement that f⁻¹: B → A. Because this means ∀b ∈ B ∃!a ∈ A((b, a) ∈
f⁻¹), we let b be an arbitrary element of B and then prove existence and
uniqueness for the required a ∈ A separately. Note that the assumption that
f is onto is the key to the existence half of the proof, and the assumption that
f is one-to-one is the key to the uniqueness half.


Suppose f is any function from a set A to a set B. Theorem 5.3.1 says that
a sufficient condition for f⁻¹ to be a function from B to A is that f be
one-to-one and onto. Is it also a necessary condition? In other words, is the
converse of Theorem 5.3.1 true? (If you don’t remember what the words
sufficient, necessary, and converse mean, you should review Section 1.5!)
We will show in Theorem 5.3.4 that the answer to this question is yes. In
other words, if f⁻¹ is a function from B to A, then f must be one-to-one and
onto.

If f⁻¹: B → A then, by the definition of function, for every b ∈ B there is
exactly one a ∈ A such that (b, a) ∈ f⁻¹, and

$$\begin{aligned}
f^{-1}(b) &= \text{the unique } a \in A \text{ such that } (b, a) \in f^{-1} \\
&= \text{the unique } a \in A \text{ such that } (a, b) \in f \\
&= \text{the unique } a \in A \text{ such that } f(a) = b.
\end{aligned}$$

This gives another useful way to think about f⁻¹. If f⁻¹ is a function from B
to A, then it is the function that assigns, to each b ∈ B, the unique a ∈ A
such that f(a) = b. The assumption in Theorem 5.3.1 that f is one-to-one and
onto guarantees that there is exactly one such a.

As an example, consider again the function f that assigns, to each person 
in the audience at a sold-out concert, the seat in which that person is sitting. 
As we saw at the end of the last section, f is a one-to-one, onto function 
from the set A of all members of the audience to the set S of all seats in the 
concert hall. Thus, f⁻¹ must be a function from S to A, and for each s ∈ S,

$$\begin{aligned}
f^{-1}(s) &= \text{the unique } a \in A \text{ such that } f(a) = s \\
&= \text{the unique person } a \text{ such that the seat in which } a \text{ is sitting is } s \\
&= \text{the person who is sitting in the seat } s.
\end{aligned}$$

In other words, the function f assigns to each person the seat in which that
person is sitting, and the function f⁻¹ assigns to each seat the person sitting
in that seat.


Because f: A → S and f⁻¹: S → A, it follows by Theorem 5.1.5 that f⁻¹ ∘ f:
A → A and f ∘ f⁻¹: S → S. What are these functions? To figure out what the
first function is, let’s let a be an arbitrary element of A and compute
(f⁻¹ ∘ f)(a).

$$\begin{aligned}
(f^{-1} \circ f)(a) &= f^{-1}(f(a)) \\
&= f^{-1}(\text{the seat in which } a \text{ is sitting}) \\
&= \text{the person sitting in the seat in which } a \text{ is sitting} \\
&= a.
\end{aligned}$$

But recall that for every a ∈ A, i_A(a) = a. Thus, we have shown that ∀a ∈
A((f⁻¹ ∘ f)(a) = i_A(a)), so by Theorem 5.1.4, f⁻¹ ∘ f = i_A. Similarly, you
should be able to check that f ∘ f⁻¹ = i_S.
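For a small, made-up audience, the two compositions can be checked directly. This is only a sketch: the names and seat labels below are invented for illustration, and a finite function is represented as a Python dict.

```python
# Hypothetical audience: the names and seat labels are invented for illustration.
seat_of = {"Ana": "A1", "Ben": "A2", "Cy": "B1"}  # f: A -> S

# Swapping each pair gives the inverse relation; it is a function from S to A
# only because seat_of is one-to-one and onto the set of seats used here.
person_in = {seat: person for person, seat in seat_of.items()}  # inverse of f


def compose(outer, inner):
    """Return the composition outer ∘ inner as a dict (apply inner first)."""
    return {x: outer[inner[x]] for x in inner}


identity_on_people = {p: p for p in seat_of}
identity_on_seats = {s: s for s in person_in}

print(compose(person_in, seat_of) == identity_on_people)  # inverse after f
print(compose(seat_of, person_in) == identity_on_seats)   # f after inverse
```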

When mathematicians find an unusual phenomenon like this in an
example, they always wonder whether it’s just a coincidence or if it’s part
of a more general pattern. In other words, can we prove a theorem that says
that what happened in this example will happen in other examples too? In
this case, it turns out that we can.


Theorem 5.3.2. Suppose f is a function from A to B, and suppose that f⁻¹ is
a function from B to A. Then f⁻¹ ∘ f = i_A and f ∘ f⁻¹ = i_B.


Proof. Let a be an arbitrary element of A. Let b = f(a) ∈ B. Then (a, b) ∈ f,
so (b, a) ∈ f⁻¹ and therefore f⁻¹(b) = a. Thus,

$$(f^{-1} \circ f)(a) = f^{-1}(f(a)) = f^{-1}(b) = a = i_A(a).$$

Since a was arbitrary, we have shown that ∀a ∈ A((f⁻¹ ∘ f)(a) = i_A(a)), so
f⁻¹ ∘ f = i_A. The proof of the second half of the theorem is similar and is left
as an exercise (see exercise 8).
□


Commentary. To prove that two functions are equal, we usually apply Theorem
5.1.4. Thus, since f⁻¹ ∘ f and i_A are both functions from A to A, to prove
that they are equal we prove that ∀a ∈ A((f⁻¹ ∘ f)(a) = i_A(a)).


Theorem 5.3.2 says that if f: A → B and f⁻¹: B → A, then each function
undoes the effect of the other. For any a ∈ A, applying the function f gives
us f(a) ∈ B. According to Theorem 5.3.2, f⁻¹(f(a)) = (f⁻¹ ∘ f)(a) = i_A(a) = a.
Thus, applying f⁻¹ to f(a) undoes the effect of applying f, giving us back the
original element a. Similarly, for any b ∈ B, applying f⁻¹ we get f⁻¹(b) ∈ A,
and we can undo the effect of applying f⁻¹ by applying f, since f(f⁻¹(b)) = b.

For example, let f: ℝ → ℝ be defined by the formula f(x) = 2x. You
should be able to check that f is one-to-one and onto, so f⁻¹: ℝ → ℝ, and for
any x ∈ ℝ,

$$f^{-1}(x) = \text{the unique } y \text{ such that } f(y) = x.$$

Because f⁻¹(x) is the unique solution for y in the equation f(y) = x, we can
find a formula for f⁻¹(x) by solving this equation for y. Filling in the
definition of f in the equation gives us 2y = x, so y = x/2. Thus, for every x
∈ ℝ, f⁻¹(x) = x/2. Notice that applying f to any number doubles the number
and applying f⁻¹ halves the number, and each of these operations undoes the
effect of the other. In other words, if you double a number and then halve 
the result, you get back the number you started with. Similarly, halving any 
number and then doubling the result gives you back the original number. 

Are there other circumstances in which the composition of two functions 
is equal to the identity function? Investigation of this question leads to the 
following theorem. 


Theorem 5.3.3. Suppose f: A → B.
1. If there is a function g: B → A such that g ∘ f = i_A then f is one-to-one.

2. If there is a function g: B → A such that f ∘ g = i_B then f is onto.


Proof. 


1. Suppose g: B → A and g ∘ f = i_A. Let a₁ and a₂ be arbitrary elements
of A, and suppose that f(a₁) = f(a₂). Applying g to both sides of this
equation we get g(f(a₁)) = g(f(a₂)). But g(f(a₁)) = (g ∘ f)(a₁) =
i_A(a₁) = a₁, and similarly, g(f(a₂)) = a₂. Thus, we can conclude that a₁
= a₂, and therefore f is one-to-one.


2. See exercise 9.
□


Commentary. The assumption that there is a g: B → A such that g ∘ f = i_A
is an existential statement, so we immediately imagine that a particular 
function g has been chosen. The proof that f is one-to-one follows the 
usual pattern for such proofs, based on Theorem 5.2.3. 


We have come full circle. In Theorem 5.3.1 we found that if f is a
one-to-one, onto function from A to B, then f⁻¹ is a function from B to A. From
this conclusion it follows, as we showed in Theorem 5.3.2, that the
composition of f with its inverse must be the identity function. And in
Theorem 5.3.3 we found that when the composition of two functions is
the identity function, we are led back to the properties one-to-one and
onto! Thus, combining Theorems 5.3.1–5.3.3, we get the following
theorem.


Theorem 5.3.4. Suppose f: A → B. Then the following statements are
equivalent.

1. f is one-to-one and onto.

2. f⁻¹: B → A.

3. There is a function g: B → A such that g ∘ f = i_A and f ∘ g = i_B.


Proof. 1 → 2. This is precisely what Theorem 5.3.1 says.

2 → 3. Suppose f⁻¹: B → A. Let g = f⁻¹ and apply Theorem 5.3.2.

3 → 1. Apply Theorem 5.3.3.
□


Commentary. As we saw in Section 3.6, the easiest way to prove that
several statements are equivalent is to prove a circle of implications. In
this case we have proven the circle 1 → 2 → 3 → 1. Note that the proofs
of these implications are quite sketchy. You should make sure you know
how to fill in all the details.
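As a sanity check rather than a proof, the equivalence in Theorem 5.3.4 can be tested by brute force on small finite sets. The sketch below assumes functions are represented as dicts; the helper names and the size bound of 3 are arbitrary choices made for illustration.

```python
from itertools import product


def all_functions(domain, codomain):
    """Yield every function from `domain` to `codomain` as a dict."""
    domain = sorted(domain)
    for values in product(sorted(codomain), repeat=len(domain)):
        yield dict(zip(domain, values))


def one_to_one_and_onto(f, B):
    values = list(f.values())
    return len(set(values)) == len(values) and set(values) == set(B)


def inverse_is_function(f, B):
    # Each b in B must be paired with exactly one a by the inverse relation.
    return all(sum(1 for a in f if f[a] == b) == 1 for b in B)


def has_two_sided_inverse(f, A, B):
    # Search for some g: B -> A with g∘f = i_A and f∘g = i_B.
    return any(
        all(g[f[a]] == a for a in A) and all(f[g[b]] == b for b in B)
        for g in all_functions(B, A)
    )


def statements_agree(max_size=3):
    """Check that statements 1, 2, and 3 have the same truth value for every f tried."""
    for m in range(1, max_size + 1):
        for n in range(1, max_size + 1):
            A, B = set(range(m)), set(range(10, 10 + n))
            for f in all_functions(A, B):
                truth = {
                    one_to_one_and_onto(f, B),
                    inverse_is_function(f, B),
                    has_two_sided_inverse(f, A, B),
                }
                if len(truth) != 1:
                    return False
    return True


print(statements_agree())
```

A finite search like this can only fail to find a counterexample; the circle of implications in the proof is what establishes the theorem for all sets.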


For example, let f and g be functions from ℝ to ℝ defined by the
following formulas:

$$f(x) = \frac{x + 7}{5}, \qquad g(x) = 5x - 7.$$

Then for any real number x,

$$(g \circ f)(x) = g(f(x)) = g\left(\frac{x + 7}{5}\right) = 5 \cdot \frac{x + 7}{5} - 7 = x + 7 - 7 = x.$$

Thus, g ∘ f = i_ℝ. A similar computation shows that f ∘ g = i_ℝ. Thus, it
follows from Theorem 5.3.4 that f must be one-to-one and onto, and f⁻¹
must also be a function from ℝ to ℝ. What is f⁻¹? Of course, a logical
guess would be that f⁻¹ = g, but this doesn’t actually follow from the
theorems we’ve proven. You could check it directly by solving for f⁻¹(x),
using the fact that f⁻¹(x) must be the unique solution for y in the equation
f(y) = x. However, there is no need to check. The next theorem shows that
f⁻¹ must be equal to g.

Theorem 5.3.5. Suppose f: A → B, g: B → A, g ∘ f = i_A, and f ∘ g = i_B.
Then g = f⁻¹.


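Proof. By Theorem 5.3.4, f⁻¹: B → A. Therefore, by Theorem 5.3.2, f⁻¹ ∘ f
= i_A. Thus,

$$\begin{aligned}
g &= i_A \circ g && \text{(exercise 9 of Section 4.3)} \\
&= (f^{-1} \circ f) \circ g \\
&= f^{-1} \circ (f \circ g) && \text{(Theorem 4.2.5)} \\
&= f^{-1} \circ i_B \\
&= f^{-1} && \text{(exercise 9 of Section 4.3).}
\end{aligned}$$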


Commentary. This proof gets the desired conclusion quickly by clever use 
of previous theorems and exercises. For a more direct but somewhat 
longer proof, see exercise 10. 


Example 5.3.6. In each part, determine whether or not f is one-to-one
and onto. If it is, find f⁻¹.

1. Let A = ℝ \ {0} and B = ℝ \ {2}, and define f: A → B by the formula

$$f(x) = \frac{1}{x} + 2.$$

(Note that for all x ∈ A, 1/x is defined and nonzero, so f(x) ≠ 2 and
therefore f(x) ∈ B.)

2. Let A = ℝ and B = {x ∈ ℝ | x ≥ 0}, and define f: A → B by the
formula

$$f(x) = x^2.$$

Solutions 


1. You can check directly that f is one-to-one and onto, but we won’t
bother to check. Instead, we’ll simply try to find a function g: B →
A such that g ∘ f = i_A and f ∘ g = i_B. We know by Theorems 5.3.4 and
5.3.5 that if we find such a g, then we can conclude that f is
one-to-one and onto and g = f⁻¹.

Because we’re hoping to have g = f⁻¹, we know that for any x ∈ B =
ℝ \ {2}, g(x) must be the unique y ∈ A such that f(y) = x. Thus, to find a
formula for g(x), we solve for y in the equation f(y) = x. Filling in the
definition of f, we see that the equation we must solve is

$$\frac{1}{y} + 2 = x.$$

Solving this equation we get

$$\frac{1}{y} + 2 = x \;\Rightarrow\; \frac{1}{y} = x - 2 \;\Rightarrow\; y = \frac{1}{x - 2}.$$

Thus, we define g: B → A by the formula

$$g(x) = \frac{1}{x - 2}.$$

(Note that for all x ∈ B, x ≠ 2, so 1/(x − 2) is defined and nonzero, and
therefore g(x) ∈ A.) Let’s check that g has the required properties. For
any x ∈ A, we have

$$g(f(x)) = g\left(\frac{1}{x} + 2\right) = \frac{1}{\frac{1}{x} + 2 - 2} = \frac{1}{1/x} = x.$$

Thus, g ∘ f = i_A. Similarly, for any x ∈ B,

$$f(g(x)) = f\left(\frac{1}{x - 2}\right) = \frac{1}{1/(x - 2)} + 2 = x - 2 + 2 = x,$$

so f ∘ g = i_B. Therefore, as we observed earlier, f must be one-to-one and
onto, and g = f⁻¹.

2. Imitating the solution to part 1, let’s try to find a function g: B → A
such that g ∘ f = i_A and f ∘ g = i_B. Because applying f to a number
squares the number and we want g to undo the effect of f, a
reasonable guess would be to let g(x) = √x. Let’s see if this works.

For any x ∈ B we have

$$f(g(x)) = f(\sqrt{x}) = (\sqrt{x})^2 = x,$$

so f ∘ g = i_B. But for x ∈ A we have

$$g(f(x)) = g(x^2) = \sqrt{x^2},$$

and this is not always equal to x. For example, g(f(−3)) = √((−3)²) =
√9 = 3 ≠ −3. Thus, g ∘ f ≠ i_A. This example illustrates that you must
check both f ∘ g = i_B and g ∘ f = i_A. It is possible for one to work but not
the other.

What went wrong? We know that if f⁻¹ is a function from B to A, then
for any x ∈ B, f⁻¹(x) must be the unique solution for y in the equation
f(y) = x. Applying the definition of f gives us y² = x, so y = ±√x. Thus,
there is not a unique solution for y in the equation f(y) = x; there are two
solutions. For example, when x = 9 we get y = ±3. In other words, f(3)
= f(−3) = 9. But this means that f is not one-to-one! Thus, f⁻¹ is not a
function from B to A.
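Both parts of this example can be spot-checked numerically. The sketch below uses exact rational arithmetic for part 1 and floating-point square roots for part 2; the sample points are an arbitrary choice, so this illustrates the computations rather than proving anything about all real numbers.

```python
from fractions import Fraction
import math


# Part 1: f(x) = 1/x + 2 on A = R \ {0}, candidate inverse g(x) = 1/(x - 2) on B = R \ {2}.
def f1(x):
    return 1 / x + 2


def g1(x):
    return 1 / (x - 2)


# Exact rational sample points; 0 is skipped for A and 2 is skipped for B.
samples = [Fraction(n, d) for n in range(-6, 7) for d in (1, 2, 3)]
part1_ok = (all(g1(f1(x)) == x for x in samples if x != 0)
            and all(f1(g1(x)) == x for x in samples if x != 2))


# Part 2: f(x) = x^2 with the guess g(x) = sqrt(x).
def f2(x):
    return x * x


def g2(x):
    return math.sqrt(x)


print(part1_ok)       # True: both compositions return x at every sample point
print(f2(g2(9.0)))    # 9.0, so f∘g behaves like the identity here
print(g2(f2(-3.0)))   # 3.0, not -3.0, so g∘f is not the identity on A
```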


Functions that undo each other come up often in mathematics. For
example, if you are familiar with logarithms, then you will recognize the
formulas 10^(log x) = x and log 10^x = x. (We are using base-10 logarithms
here.) We can rephrase these formulas in the language of this section by
defining functions f: ℝ → ℝ⁺ and g: ℝ⁺ → ℝ as follows:

$$f(x) = 10^x, \qquad g(x) = \log x.$$

Then for any x ∈ ℝ we have g(f(x)) = log 10^x = x, and for any x ∈ ℝ⁺,
f(g(x)) = 10^(log x) = x. Thus, g ∘ f = i_ℝ and f ∘ g = i_{ℝ⁺}, so g = f⁻¹. In other
words, the logarithm function is the inverse of the “raise 10 to the power”
function.


We saw another example of functions that undo each other in Section 4.5.
Suppose A is any set, let ℰ be the set of all equivalence relations on A, and
let 𝒫 be the set of all partitions of A. Define a function f: ℰ → 𝒫 by the
formula f(R) = A/R, and define another function g: 𝒫 → ℰ by the formula

$$g(\mathcal{F}) = \text{the equivalence relation determined by } \mathcal{F} = \bigcup_{X \in \mathcal{F}} (X \times X).$$

You should verify that the proof of Theorem 4.5.6 shows that f ∘ g = i_𝒫, and
exercise 10 in Section 4.5 shows that g ∘ f = i_ℰ. Thus, f is one-to-one and
onto, and g = f⁻¹. One interesting consequence of this is that if A has a finite
number of elements, then we can say that the number of equivalence 
relations on A is exactly the same as the number of partitions of A, even 
though we don’t know what this number is. 
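For a very small set, this correspondence can be checked by enumeration. The sketch below takes A = {0, 1, 2} (an arbitrary choice), lists the equivalence relations and the partitions independently of each other, and then checks that the two maps undo each other. The helper names are illustrative.

```python
from itertools import combinations, product

A = (0, 1, 2)  # a small example set, chosen arbitrarily


def is_equivalence(R):
    reflexive = all((x, x) in R for x in A)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((x, w) in R for (x, y) in R for (z, w) in R if y == z)
    return reflexive and symmetric and transitive


def is_partition(F):
    # F is a family of nonempty subsets of A; they must cover A and be pairwise disjoint.
    covers = set().union(*F) == set(A)
    disjoint = all(X.isdisjoint(Y) for X, Y in combinations(F, 2))
    return covers and disjoint


def sub_collections(items):
    """Yield every subset of `items` as a frozenset."""
    for bits in product((False, True), repeat=len(items)):
        yield frozenset(item for item, keep in zip(items, bits) if keep)


pairs = [(x, y) for x in A for y in A]
blocks = [frozenset(c) for r in range(1, len(A) + 1) for c in combinations(A, r)]

equivalences = [R for R in sub_collections(pairs) if is_equivalence(R)]
partitions = [F for F in sub_collections(blocks) if is_partition(F)]


def f(R):
    """f(R) = A/R, the set of equivalence classes of R."""
    return frozenset(frozenset(y for y in A if (x, y) in R) for x in A)


def g(F):
    """g(F) = the union of X x X over all blocks X of F."""
    return frozenset((x, y) for X in F for x in X for y in X)


print(all(f(g(F)) == F for F in partitions))    # f∘g is the identity on partitions
print(all(g(f(R)) == R for R in equivalences))  # g∘f is the identity on equivalence relations
print(len(equivalences), len(partitions))       # 5 5
```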


Exercises 


*1. Let R be the function defined in exercise 2(c) of Section 5.1. In
exercise 2 of Section 5.2, you showed that R is one-to-one and onto,
so R⁻¹: P → P. If p ∈ P, what is R⁻¹(p)?

2. Let F be the function defined in exercise 4(b) of Section 5.1. In
exercise 4 of Section 5.2, you showed that F is one-to-one and onto,
so F⁻¹: B → B. If X ∈ B, what is F⁻¹(X)?

*3. Let f: ℝ → ℝ be defined by the formula


, 2x +5 
f (x)= : 


Show that f is one-to-one and onto, and find a formula for f⁻¹(x). (You
may want to imitate the method used in the example after Theorem 5.3.2, 
or in Example 5.3.6.) 


4. Let f: ℝ → ℝ be defined by the formula f(x) = 2x³ − 3. Show that f is
one-to-one and onto, and find a formula for f⁻¹(x).

5. Let f: ℝ → ℝ⁺ be defined by the formula f(x) = 10^(2−x). Show that f is
one-to-one and onto, and find a formula for f⁻¹(x).

6. Let A = ℝ \ {2}, and let f be the function with domain A defined by
the formula

f(x) =

(a) Show that f is a one-to-one, onto function from A to B for some set
B ⊆ ℝ. What is the set B?

(b) Find a formula for f⁻¹(x).

7. In the example after Theorem 5.3.4, we had f(x) = (x + 7)/5 and found
that f⁻¹(x) = 5x − 7. Let f₁ and f₂ be functions from ℝ to ℝ defined by
the formulas

$$f_1(x) = x + 7, \qquad f_2(x) = \frac{x}{5}.$$

(a) Show that f = f₂ ∘ f₁.

(b) According to part 5 of Theorem 4.2.5, f⁻¹ = (f₂ ∘ f₁)⁻¹ = f₁⁻¹ ∘ f₂⁻¹.
Verify that this is true by computing f₁⁻¹ ∘ f₂⁻¹ directly.

8. (a) Prove the second half of Theorem 5.3.2 by imitating the proof of
the first half.

(b) Give an alternative proof of the second half of Theorem 5.3.2 by
applying the first half to f⁻¹.


*9. Prove part 2 of Theorem 5.3.3.

10. Use the following strategy to give an alternative proof of Theorem
5.3.5: Let (b, a) be an arbitrary element of B × A. Assume (b, a) ∈ g
and prove (b, a) ∈ f⁻¹. Then assume (b, a) ∈ f⁻¹ and prove (b, a) ∈
g.

*11. Suppose f: A → B and g: B → A.

(a) Prove that if f is one-to-one and f ∘ g = i_B, then g = f⁻¹.

(b) Prove that if f is onto and g ∘ f = i_A, then g = f⁻¹.

(c) Prove that if f ∘ g = i_B but g ∘ f ≠ i_A, then f is onto but not one-to-one,
and g is one-to-one but not onto.

12. Suppose f: A → B and f is one-to-one. Prove that there is some set B′
⊆ B such that f⁻¹: B′ → A.

13. Suppose f: A → B and f is onto. Let R = {(x, y) ∈ A × A | f(x) = f(y)}.
By exercise 20(a) of Section 5.1, R is an equivalence relation on A.

(a) Prove that there is a function h: A/R → B such that for all x ∈ A,
h([x]_R) = f(x). (Hint: See exercise 21 of Section 5.1.)

(b) Prove that h is one-to-one and onto. (Hint: See exercise 19 of Section
5.2.)

(c) It follows from part (b) that h⁻¹: B → A/R. Prove that for all b ∈ B,
h⁻¹(b) = {x ∈ A | f(x) = b}.

(d) Suppose g: B → A. Prove that f ∘ g = i_B iff ∀b ∈ B(g(b) ∈ h⁻¹(b)).

*14. Suppose f: A → B, g: B → A, and f ∘ g = i_B. Let A′ = Ran(g) ⊆ A.

(a) Prove that for all x ∈ A′, (g ∘ f)(x) = x.

(b) Prove that f ↾ A′ is a one-to-one, onto function from A′ to B and g = (f
↾ A′)⁻¹. (See exercise 7 of Section 5.1 for the meaning of the
notation used here.)

15. Let B = {x ∈ ℝ | x ≥ 0}. Let f: ℝ → B and g: B → ℝ be defined
by the formulas f(x) = x² and g(x) = √x. As we saw in part 2 of
Example 5.3.6, g ≠ f⁻¹. Show that g = (f ↾ B)⁻¹. (Hint: See exercise
14.)

*16. Let f: ℝ → ℝ be defined by the formula f(x) = 4x − x². Let B = Ran(f).

(a) Find B.

(b) Find a set A ⊆ ℝ such that f ↾ A is a one-to-one, onto function from A
to B, and find a formula for (f ↾ A)⁻¹(x). (Hint: See exercise 14.)

17. Suppose A is a set, and let ℱ = {f | f: A → A} and 𝒫 = {f ∈ ℱ | f is
one-to-one and onto}. Define a relation R on ℱ as follows:

$$R = \{(f, g) \in \mathcal{F} \times \mathcal{F} \mid \exists h \in \mathcal{P}\,(f = h^{-1} \circ g \circ h)\}.$$

(a) Prove that R is an equivalence relation.

(b) Prove that if f R g then (f ∘ f) R (g ∘ g).

(c) For any f ∈ ℱ and a ∈ A, if f(a) = a then we say that a is a fixed point
of f. Prove that if f has a fixed point and f R g, then g also has a fixed
point.


*18. Suppose f: A → C, g: B → C, and g is one-to-one and onto. Prove that
there is a function h: A → B such that g ∘ h = f.


5.4 Closures 


Often in mathematics we work with a function from a set to itself. In that 
situation, the following concept can be useful. 


Definition 5.4.1. Suppose f: A → A and C ⊆ A. We will say that C is
closed under f if ∀x ∈ C(f(x) ∈ C).


Example 5.4.2.
1. Let A = {a, b, c, d} and f = {(a, c), (b, b), (c, d), (d, c)}. Then f: A →
A. Let C₁ = {a, c, d} and C₂ = {a, b}. Is C₁ closed under f? Is C₂?

2. Let f: ℝ → ℝ and g: ℝ → ℝ be defined by the formulas f(x) = x + 1
and g(x) = x − 1. Is ℕ closed under f? Is it closed under g?

3. Let f: ℝ → ℝ be defined by the formula f(x) = x². Let C₁ = {x ∈ ℝ |
0 < x < 1} and C₂ = {x ∈ ℝ | 0 < x < 2}. Is C₁ closed under f? Is
C₂?


Solutions 


1. The set C₁ is closed under f, because f(a) = f(d) = c ∈ C₁ and f(c) =
d ∈ C₁. However, C₂ is not closed under f, because a ∈ C₂ but f(a)
= c ∉ C₂.

2. For every natural number n, n + 1 is also a natural number, so ℕ is
closed under f. However, ℕ is not closed under g, because 0 ∈ ℕ
but g(0) = −1 ∉ ℕ.

3. For every real number x, if 0 < x < 1 then 0 < x² < 1 (see Example
3.1.2), so C₁ is closed under f. But 1.5 ∈ C₂ and f(1.5) = 1.5² =
2.25 ∉ C₂, so C₂ is not closed under f.
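For the finite function in part 1, Definition 5.4.1 translates almost word for word into code. This is a minimal sketch in which the function f is stored as a dict; the name `is_closed` is an illustrative choice.

```python
def is_closed(C, f):
    """Return True if f(x) belongs to C for every x in C (f is given as a dict)."""
    return all(f[x] in C for x in C)


# The function f and the sets C1, C2 from part 1 of Example 5.4.2.
f = {"a": "c", "b": "b", "c": "d", "d": "c"}
C1 = {"a", "c", "d"}
C2 = {"a", "b"}

print(is_closed(C1, f))  # True
print(is_closed(C2, f))  # False, because f("a") = "c" is not in C2
```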


We saw in part 2 of Example 5.4.2 that ℕ is not closed under the function
g: ℝ → ℝ defined by the formula g(x) = x − 1. Suppose we wanted to add
elements to ℕ to get a set that is closed under g. Since 0 ∈ ℕ, we’d need
to add g(0) = −1. But if −1 were added to the set, then it would also have
to contain g(−1) = −2, and if we threw in −2 then we’d also have to add
g(−2) = −3. Continuing in this way, it should be clear that we’d have to
add all of the negative integers to ℕ, giving us the set of all integers, ℤ.
But notice that ℤ is closed under g, because for every integer n, n − 1 is
also an integer. So we have succeeded in our task of enlarging ℕ to get a
set closed under g.

When we enlarged ℕ to ℤ, the numbers we added (the negative
integers) were numbers that had to be added if we wanted the resulting
set to be closed under g. It follows that ℤ is the smallest set containing ℕ
that is closed under g. We are using the word smallest here in exactly the
way we defined it in Section 4.4. If we let ℱ = {C ⊆ ℝ | ℕ ⊆ C and C is
closed under g}, then ℤ is the smallest element of ℱ, where as usual it is
understood that we mean smallest in the sense of the subset partial order.
In other words, ℤ is an element of ℱ, and it’s a subset of every element of
ℱ. We will say that ℤ is the closure of ℕ under g.


Definition 5.4.3. Suppose f: A → A and B ⊆ A. Then the closure of B
under f is the smallest set C ⊆ A such that B ⊆ C and C is closed under f,
if there is such a smallest set. In other words, a set C ⊆ A is the closure of
B under f if it has the following properties:


1. B ⊆ C.
2. C is closed under f.
3. For every set D ⊆ A, if B ⊆ D and D is closed under f then C ⊆ D.


According to Theorem 4.4.6, if a set has a smallest element, then it can 
have only one smallest element. Thus, if a set B has a closure under a 
function f, then this closure must be unique, so it makes sense to call it the 
closure rather than a closure. However, as we saw in Example 4.4.7, some 
families of sets don’t have smallest elements, so it is not immediately 
clear if sets always have closures under functions. In fact they do, as we 
will show in our proof of Theorem 5.4.5 below. But first let’s look at a 
few more examples of closures. 


Example 5.4.4.

1. In part 1 of Example 5.4.2, the set C₂ = {a, b} was not closed under
f. What is the closure of C₂ under f?

2. Let f: ℝ → ℝ be defined by the formula f(x) = x + 1, and let B =
{0}. What is the closure of B under f?


Solutions 


1. Since a ∈ C₂, to get a set closed under f we will need to add in f(a)
= c. But then we’ll also have to add f(c) = d, giving us the entire set
A = {a, b, c, d}. Clearly A is closed under f, so the closure of C₂
under f is A.

2. Since 0 ∈ B, the closure of B under f will have to contain f(0) = 1.
But then it must also contain f(1) = 2, f(2) = 3, f(3) = 4, and in fact
all positive integers. Adding all the positive integers to B gives us
the set ℕ, which we already know from part 2 of Example 5.4.2 is
closed under f. Thus the closure of {0} under f is ℕ.
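The "keep adding what must be added" reasoning used in these solutions can be sketched as a loop when the underlying set is finite. The function name `closure` is an illustrative choice, and the loop is only guaranteed to stop because the dict f has finitely many keys; for an infinite example such as part 2 it would never finish.

```python
def closure(B, f):
    """Repeatedly add f(x) for elements x already collected until nothing new appears."""
    result = set(B)
    frontier = set(B)  # elements whose images have not been examined yet
    while frontier:
        new = {f[x] for x in frontier} - result
        result |= new
        frontier = new
    return result


# The function f from part 1 of Example 5.4.2.
f = {"a": "c", "b": "b", "c": "d", "d": "c"}

print(sorted(closure({"a", "b"}, f)))  # ['a', 'b', 'c', 'd']
print(sorted(closure({"b"}, f)))       # ['b'], since {b} is already closed
```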


Here’s an example that illustrates the usefulness of the concepts we 
have been discussing. Let P be a set of people, and suppose that each 
person in the set P has a best friend who is also in P. Then we can define
a function f: P → P by letting f(p) = p’s best friend. Suppose that whenever
someone in the set P hears a piece of gossip, he or she tells it to his or her
best friend (but no one else). Now consider any set C ⊆ P, and suppose
that C is closed under f. Then for any person p ∈ C, p’s best friend is also
in C. Thus, if any person in C hears a piece of gossip, the only person he 
or she will tell the gossip to is also in C. No one in C will ever transmit 
gossip to a person who is not in C. Thus, if we tell some people in C a bit 
of gossip, it may spread to other people in C, but it will never leave C. If 
you want to track the spread of gossip in this population, you should be 
interested in recognizing which subsets of P are closed under f. 


Suppose we tell a piece of gossip to all of the people in some set B ⊆
P. How will the gossip spread? The people in B will tell their best friends,
and then they will tell their best friends, who will tell their best friends,
and so on. Based on our previous examples, you might guess that the set
H of people who eventually hear the gossip will be the closure of B under
f. Let’s see if we can give a careful proof that H has the three properties
listed in Definition 5.4.3.


Clearly B ⊆ H, since the people in B hear the gossip right at the start of
the process. This confirms property 1 of Definition 5.4.3. If p is any
element of H, then p eventually hears the gossip. But as soon as p hears
the gossip, he or she will tell f(p), so f(p) ∈ H as well. Thus H is closed
under f, as required by property 2 of the definition. Finally, suppose B ⊆
C ⊆ P and C is closed under f. Then as we observed earlier, any gossip
that is told to the people in B may spread to others in C, but it will never
leave C. Thus, everyone who ever hears the gossip must belong to C,
which means that H ⊆ C. This confirms property 3, so H is indeed the
closure of B under f.

We turn now to the proof that closures always exist. Suppose f: A → A
and B ⊆ A. One way to try to prove the existence of the closure of B
under f is to add to B those elements that must be added to make it closed
under f, as we did in earlier examples, and then prove that the result is
closed under f. Although this can be done, a careful treatment of the
details of this proof would require the method of mathematical induction,
which we have not yet discussed. We will present this proof in Section
6.5, after we’ve discussed mathematical induction. But there is another
approach to the proof that uses only ideas that we have already studied.
We know that the closure of B under f, if it exists, must be the smallest
element of the family ℱ = {C ⊆ A | B ⊆ C and C is closed under f}.
According to exercise 20 of Section 4.4, the smallest element of a set is
also always the greatest lower bound of the set, and by Theorem 4.4.11,
the g.l.b. of any nonempty family of sets ℱ is ⋂ℱ. This is the motivation
for our next proof.

Theorem 5.4.5. Suppose that f: A → A and B ⊆ A. Then B has a closure
under f.

Proof. Let ℱ = {C ⊆ A | B ⊆ C and C is closed under f}. You should be
able to check that A ∈ ℱ, and therefore ℱ ≠ ∅. Thus, we can let
C = ⋂ℱ, and by exercise 9 of Section 3.3, C ⊆ A. We will show that C is
the closure of B under f by proving the three properties in Definition
5.4.3.

To prove the first property, suppose x ∈ B. Let D be an arbitrary
element of ℱ. Then by the definition of ℱ, B ⊆ D, so x ∈ D. Since D was
arbitrary, this shows that ∀D ∈ ℱ(x ∈ D), so x ∈ ⋂ℱ = C. Thus, B ⊆ C.

Next, suppose x ∈ C and again let D be an arbitrary element of ℱ.
Then since x ∈ C = ⋂ℱ, x ∈ D. But since D ∈ ℱ, D is closed under f,
so f(x) ∈ D. Since D was arbitrary, we can conclude that ∀D ∈ ℱ(f(x) ∈
D), so f(x) ∈ ⋂ℱ = C. Thus, we have shown that C is closed under f,
which is the second property in Definition 5.4.3.

Finally, to prove the third property, suppose B ⊆ D ⊆ A and D is closed
under f. Then D ∈ ℱ, and applying exercise 9 of Section 3.3 again we can
conclude that C = ⋂ℱ ⊆ D. □


Commentary. Our goal is ∃C(C is the closure of B under f), so we should
begin by defining C. However, the definition C = ⋂ℱ doesn’t make
sense unless we know ℱ ≠ ∅, so we must prove this first. Because ℱ ≠ ∅
means ∃D(D ∈ ℱ), we prove it by giving an example of an element of ℱ.
The example is A, so we must prove A ∈ ℱ. The statement in the proof
that “you should be able to check” that A ∈ ℱ really does mean that you
should do the checking. According to the definition of ℱ, to say that A ∈
ℱ means that A ⊆ A, B ⊆ A, and A is closed under f. You should make
sure you see why all three of these statements are true.


Having defined C and verified that C ⊆ A, we must prove that C has
the three properties in the definition of the closure of B under f. To prove
the first statement, B ⊆ C, we let x be an arbitrary element of B and prove
x ∈ C. Since C = ⋂ℱ, the goal x ∈ C means ∀D ∈ ℱ(x ∈ D), so to
prove it we let D be an arbitrary element of ℱ and prove x ∈ D. To prove
that C is closed under f, we assume that x ∈ C and prove f(x) ∈ C. Once
again, by the definition of C this goal means ∀D ∈ ℱ(f(x) ∈ D), so we
let D be an arbitrary element of ℱ and prove f(x) ∈ D. Finally, to prove
the third goal we assume that D ⊆ A, B ⊆ D, and D is closed under f and
prove C ⊆ D. Fortunately, an exercise from an earlier section takes care
of this proof.
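The construction in the proof of Theorem 5.4.5 can be carried out literally when A is a small finite set: list every subset of A, keep those that contain B and are closed under f, and intersect them. The sketch below is an illustration under that finiteness assumption, with helper names chosen for this example; it is not an efficient way to compute closures.

```python
from itertools import combinations


def subsets(A):
    """Return every subset of the finite set A as a frozenset."""
    items = sorted(A)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def is_closed(C, f):
    return all(f[x] in C for x in C)


def closure_by_intersection(A, B, f):
    """Intersect every C ⊆ A such that B ⊆ C and C is closed under f."""
    family = [C for C in subsets(A) if B <= C and is_closed(C, f)]
    # The family is nonempty because A itself belongs to it.
    result = set(A)
    for C in family:
        result &= C
    return result


# The set A and function f from part 1 of Example 5.4.2.
A = {"a", "b", "c", "d"}
f = {"a": "c", "b": "b", "c": "d", "d": "c"}

print(sorted(closure_by_intersection(A, {"a", "b"}, f)))  # ['a', 'b', 'c', 'd']
print(sorted(closure_by_intersection(A, {"c"}, f)))       # ['c', 'd']
```

The first result agrees with part 1 of Example 5.4.4, where the same closure was found by adding elements one at a time.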


Closed sets and closures also come up in the study of functions of more
than one variable. If f: A × A → A, then f is called a function of two
variables. An element of the domain of f would be an ordered pair (x, y),
where x, y ∈ A. The result of applying f to this pair should be written
f((x, y)), but it is customary to leave out one pair of parentheses and just
write f(x, y).


Definition 5.4.6. Suppose f: A × A → A and C ⊆ A. We will say that C is
closed under f if ∀x ∈ C ∀y ∈ C(f(x, y) ∈ C).


Example 5.4.7.

1. Let f: ℝ⁺ × ℝ⁺ → ℝ⁺ and g: ℝ⁺ × ℝ⁺ → ℝ⁺ be defined by the
formulas f(x, y) = x/y and g(x, y) = x^y. Is ℚ⁺ closed under f? Is it
closed under g?

2. Let f: 𝒫(ℕ) × 𝒫(ℕ) → 𝒫(ℕ) and g: 𝒫(ℕ) × 𝒫(ℕ) → 𝒫(ℕ) be
defined by the formulas f(X, Y) = X ∪ Y and g(X, Y) = X ∩ Y. Let ℐ =
{X ∈ 𝒫(ℕ) | X is infinite}. Is ℐ closed under f? Is it closed under g?


Solutions 


1. If x, y ∈ ℚ⁺, then there are positive integers p, q, r, and s such that x
= p/q and y = r/s. Therefore

$$f(x, y) = \frac{x}{y} = \frac{p/q}{r/s} = \frac{p}{q} \cdot \frac{s}{r} = \frac{ps}{qr} \in \mathbb{Q}^+.$$

This shows that ℚ⁺ is closed under f. However, 2 and 1/2 are elements
of ℚ⁺ and g(2, 1/2) = 2^(1/2) = √2 ∉ ℚ⁺ (see Theorem 6.4.5), so ℚ⁺ is
not closed under g.

2. If X and Y are infinite sets of natural numbers, then f(X, Y) = X ∪ Y is
also infinite, so ℐ is closed under f. On the other hand, let E be the set
of even natural numbers and let P be the set of prime numbers. Then
E and P are both infinite, but g(E, P) = E ∩ P = {2}, which is finite.
Therefore ℐ is not closed under g.


As before, we can define the closure of a set under a function of two 
variables to be the smallest closed set containing it, and we can prove that 
such closures always exist. 


Definition 5.4.8. Suppose f: A × A → A and B ⊆ A. Then the closure of B
under f is the smallest set C ⊆ A such that B ⊆ C and C is closed under f,
if there is such a smallest set. In other words, a set C ⊆ A is the closure of
B under f if it has the following properties:


1. B ⊆ C.
2. C is closed under f.
3. For every set D ⊆ A, if B ⊆ D and D is closed under f then C ⊆ D.


Theorem 5.4.9. Suppose that f: A × A → A and B ⊆ A. Then B has a
closure under f.


Proof. See exercise 11. □


A function from A × A to A could be thought of as an operation that can
be applied to a pair of objects (x, y) ∈ A × A to produce another element
of A. Often in mathematics an operation to be performed on a pair of
mathematical objects (x, y) is represented by a symbol that we write
between x and y. For example, if x and y are real numbers then x + y
denotes another number, and if x and y are sets then x ∪ y is another set.
Imitating this notation, when mathematicians define a function from A ×
A to A they sometimes represent it with a symbol rather than a letter, and
they write the result of applying the function to a pair (x, y) by putting the
symbol between x and y, rather than by putting a letter before (x, y). When
a function from A × A to A is written in this way, it is usually called a
binary operation on A.

For example, in part 2 of Example 5.4.7 we defined g: 𝒫(N) × 𝒫(N)
→ 𝒫(N) by the formula g(X, Y) = X ∩ Y. Instead of introducing the name
g for this function, we could have talked about ∩ as a binary operation on
𝒫(N). We showed in the example that the set Z of all infinite subsets of N
is not closed under g. Another way to say this is that Z is not closed under
the binary operation ∩. What is the closure of Z under ∩? For the answer,
see exercise 16.

Here’s another example. We could define a binary operation ∗ on Z by
saying that for any integers x and y, x ∗ y = x² − y². Is the set {0, 1}
closed under the binary operation ∗? The answer is no, because 0 ∗ 1 =
0² − 1² = −1 ∉ {0, 1}. Thus, the closure of {0, 1} under ∗ must include
−1. But as you can easily check, {−1, 0, 1} is closed under ∗. Therefore
the closure of {0, 1} under ∗ is {−1, 0, 1}.
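
When a closure happens to be finite, it can be computed by repeatedly adding the results the operation forces until nothing new appears. The following is a minimal sketch under that assumption; the function names closure and star and the max_size safety limit are illustrative and not part of the text.

```python
def closure(start, op, max_size=10_000):
    """Close a finite set under a binary operation by repeated application.

    Assumes the closure is finite; max_size is a hypothetical safety limit
    that stops the loop if the set keeps growing.
    """
    closed = set(start)
    while True:
        # Every element added here must belong to any closed set containing start.
        new = {op(x, y) for x in closed for y in closed} - closed
        if not new:
            return closed
        closed |= new
        if len(closed) > max_size:
            raise ValueError("closure appears to be too large or infinite")


def star(x, y):
    """The binary operation x * y = x^2 - y^2 on the integers."""
    return x**2 - y**2


result = closure({0, 1}, star)
print(sorted(result))
```

Only forced elements are ever added, so when the loop stops the result is contained in every closed set that contains the starting set, which is property 3 of Definition 5.4.8.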


Exercises 


1. Let f: R → R be defined by the formula f(x) = (x + 1)/2. Are the
following sets closed under f?

(a) Z.

(b) Q.

(c) {x ∈ R | 0 < x < 4}.

(d) {x ∈ R | 2 < x < 4}.

2. Let f: 𝒫(N) → 𝒫(N) be defined by the formula f(X) = X ∪ {17}. Are
the following sets closed under f?

(a) {X ⊆ N | X is infinite}.

(b) {X ⊆ N | X is finite}.

(c) {X ⊆ N | X has at most 100 elements}.

(d) {X ⊆ N | 16 ∈ X}.

3. Let f: Z → Z be defined by the formula f(n) = n² − n. Find the closure
of {−1, 1} under f.

4. For any set A, the set of all relations on A is 𝒫(A × A). Let f: 𝒫(A ×
A) → 𝒫(A × A) be defined by the formula f(R) = R⁻¹. Is the set of
reflexive relations on A closed under f? What about the set of
symmetric relations and the set of transitive relations? (Hint: See
exercise 12 of Section 4.3.)

5. Suppose f: A → A. Is ∅ closed under f?

6. Suppose f: A → A.

(a) Prove that if Ran(f) ⊆ C ⊆ A then C is closed under f.

(b) Prove that for every set B ⊆ A, the closure of B under f is a subset of
B ∪ Ran(f).

7. Suppose f: A → A and f is one-to-one and onto. Then by Theorem
5.3.1, f⁻¹: A → A. Prove that if C ⊆ A and C is closed under f, then A
\ C is closed under f⁻¹.

8. Suppose f: A → A and C ⊆ A. Prove that C is closed under f iff the
closure of C under f is C.


*9. Suppose f: A → A and C₁ and C₂ are subsets of A that are closed
under f.

(a) Prove that C₁ ∪ C₂ is closed under f.

(b) Must C₁ ∩ C₂ be closed under f? Justify your answer.

(c) Must C₁ \ C₂ be closed under f? Justify your answer.

10. Suppose f: A → A, B₁ ⊆ A, and B₂ ⊆ A. Let C₁ be the closure of B₁
under f, and let C₂ be the closure of B₂.

(a) Prove that if B₁ ⊆ B₂ then C₁ ⊆ C₂.

(b) Prove that the closure of B₁ ∪ B₂ under f is C₁ ∪ C₂.

(c) Must the closure of B₁ ∩ B₂ be C₁ ∩ C₂? Justify your answer.

(d) Must the closure of B₁ \ B₂ be C₁ \ C₂? Justify your answer.

11. Prove Theorem 5.4.9.

12. If F is a set of functions from A to A and C ⊆ A, then we will say that
C is closed under F if ∀f ∈ F ∀x ∈ C(f(x) ∈ C). In other words, C is
closed under F iff for all f ∈ F, C is closed under f. If B ⊆ A, then the
closure of B under F is the smallest set C ⊆ A such that B ⊆ C and C
is closed under F. (You are asked to prove in the next exercise that
the closure always exists.)

(a) Let f and g be the functions from R to R defined by the formulas f(x)
= x + 1 and g(x) = x − 1. Find the closure of {0} under {f, g}.

(b) For each natural number n, let fₙ: 𝒫(N) → 𝒫(N) be defined by the
formula fₙ(X) = X ∪ {n}, and let F = {fₙ | n ∈ N}. Find the closure
of {∅} under F.

13. Suppose F is a set of functions from A to A and B ⊆ A. See the
previous exercise for the definition of the closure of B under F.

(a) Prove that B has a closure under F.


For each f ∈ F, let C_f be the closure of B under f, and let C be the
closure of B under F.

(b) Prove that ⋃_{f ∈ F} C_f ⊆ C.

(c) Must ⋃_{f ∈ F} C_f be closed under F? Justify your answer with either a
proof or a counterexample.

(d) Must ⋃_{f ∈ F} C_f = C? Justify your answer with either a proof or a
counterexample.

*14. Let f: R × R → R be defined by the formula f(x, y) = x − y. What is
the closure of N under f?

15. Let f: R⁺ × R⁺ → R⁺ be defined by the formula f(x, y) = x/y. What is
the closure of Z⁺ under f?

16. As in part 2 of Example 5.4.7, let Z = {X ∈ 𝒫(N) | X is infinite}.

(a) Prove that for every set X ⊆ N there are sets Y, Z ∈ Z such that Y ∩ Z
= X.

(b) What is the closure of Z under the binary operation ∩?

17. Let F = {f | f: R → R}. Then for any f, g ∈ F, f ∘ g ∈ F, so ∘ is a
binary operation on F. Are the following sets closed under ∘?

(a) {f ∈ F | f is one-to-one}. (Hint: See Theorem 5.2.5.)

(b) {f ∈ F | f is onto}.

(c) {f ∈ F | f is strictly increasing}. (A function f: R → R is strictly
increasing if ∀x ∈ R ∀y ∈ R(x < y → f(x) < f(y)).)

(d) {f ∈ F | f is strictly decreasing}. (A function f: R → R is strictly
decreasing if ∀x ∈ R ∀y ∈ R(x < y → f(x) > f(y)).)

18. Let F = {f | f: R → R}. If f, g ∈ F, then we define the function f + g:
R → R by the formula (f + g)(x) = f(x) + g(x). Note that + is a binary
operation on F. Are the following sets closed under +?


(a) {f ∈ F | f is one-to-one}.

(b) {f ∈ F | f is onto}.

(c) {f ∈ F | f is strictly increasing}. (See the previous exercise for the
definition of strictly increasing.)

(d) {f ∈ F | f is strictly decreasing}. (See the previous exercise for the
definition of strictly decreasing.)

19. For any set A, the set of all relations on A is 𝒫(A × A), and ∘ is a
binary operation on 𝒫(A × A). Is the set of reflexive relations on A
closed under ∘? What about the set of symmetric relations and the set
of transitive relations?

*20. Division is not a binary operation on R, because you can’t divide by
0. But we can fix this problem. We begin by adding a new element to
R. We will call the new element “NaN” (for “Not a Number”). Let
R̂ = R ∪ {NaN}, and define f: R̂ × R̂ → R̂ as follows:

f(x, y) = x/y,  if x, y ∈ R and y ≠ 0,
          NaN,  otherwise.

This notation means that if x, y ∈ R and y ≠ 0 then f(x, y) = x/y, and
otherwise f(x, y) = NaN. Thus, for example, f(3, 7) = 3/7, f(3, 0) =
NaN, and f(NaN, 7) = NaN. Which of the following sets are closed
under f?

(a) R.

(b) R⁺.

(c) R̂.

(d) Q.

(e) Q ∪ {NaN}.

21. If F is a set of functions from A × A to A and C ⊆ A, then we will say
that C is closed under F if ∀f ∈ F ∀x ∈ C ∀y ∈ C(f(x, y) ∈ C). In
other words, C is closed under F iff for all f ∈ F, C is closed under f.
If B ⊆ A, then the closure of B under F is the smallest set C ⊆ A such
that B ⊆ C and C is closed under F, if there is such a smallest set.
(Compare these definitions to the definitions in exercise 12.)

(a) Prove that the closure of B under F exists.

(b) Let f: R × R → R and g: R × R → R be defined by the formulas f(x,
y) = x + y and g(x, y) = xy. Prove that the closure of Q ∪ {√2}
under {f, g} is the set {a + b√2 | a, b ∈ Q}. (This set is called Q
with √2 adjoined, and is denoted Q(√2).)

(c) With f and g defined as in part (b), what is the closure of Q ∪ {∛2}
under {f, g}?


5.5. Images and Inverse Images: A Research Project


Suppose f: A → B. We have already seen that we can think of f as
matching each element of A with exactly one element of B. In this section
we will see that f can also be thought of as matching subsets of A with
subsets of B and vice versa.


Definition 5.5.1. Suppose f: A → B and X ⊆ A. Then the image of X
under f is the set f(X) defined as follows:

f(X) = {f(x) | x ∈ X}
     = {b ∈ B | ∃x ∈ X(f(x) = b)}.

(Note that the image of the whole domain A under f is {f(a) | a ∈ A}, and
as we saw in Section 5.1 this is the same as the range of f.)

If Y ⊆ B, then the inverse image of Y under f is the set f⁻¹(Y) defined as
follows:

f⁻¹(Y) = {a ∈ A | f(a) ∈ Y}.


Note that the function f in Definition 5.5.1 may fail to be one-to-one or
onto, and as a result f⁻¹ may not be a function from B to A, and for y ∈ B,
the notation “f⁻¹(y)” may be meaningless. However, even in this case
Definition 5.5.1 still assigns a meaning to the notation “f⁻¹(Y)” for Y ⊆ B.
If you find this surprising, look again at the definition of f⁻¹(Y), and
notice that it does not treat f⁻¹ as a function. The definition refers only to
the results of applying f to elements of A, not the results of applying f⁻¹ to
elements of B.
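
For finite sets, both definitions translate directly into code. The sketch below stores a function as a Python dict; the particular function f and the set B = {"a", "b", "c"} are hypothetical, chosen so that f is neither one-to-one nor onto. Note that inverse_image only ever applies f and never an inverse function.

```python
def image(f, X):
    """f(X) = {f(x) | x in X}, for a function f stored as a dict."""
    return {f[x] for x in X}


def inverse_image(f, Y):
    """The set {a in A | f(a) in Y}, where A is the set of keys of f."""
    return {a for a in f if f[a] in Y}


# Hypothetical f: A -> B with A = {1, 2, 3} and B = {"a", "b", "c"}.
# f is not one-to-one (2 and 3 both map to "b") and not onto ("c" is missed).
f = {1: "a", 2: "b", 3: "b"}

img = image(f, {2, 3})
inv_b = inverse_image(f, {"b"})
inv_c = inverse_image(f, {"c"})
print(img, inv_b, inv_c)
```

Even though this f has no inverse function, the inverse image of every subset of B is still a well-defined subset of A; for the set {"c"} it is simply empty.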

For example, let L be the function defined in part 3 of Example 5.1.2, 
which assigns to each city the country in which that city is located. As in 
Example 5.1.2, let C be the set of all cities and N the set of all countries. 
If B is the set of all cities with population at least one million, then B is a 
subset of C, and the image of B under L would be the set 


L(B) = {L(b) | b ∈ B}
     = {n ∈ N | ∃b ∈ B(L(b) = n)}
     = {n ∈ N | there is some city with population at least
       one million that is located in the country n}.

Thus, L(B) is the set of all countries that contain a city with population at
least one million. Now let A be the subset of N consisting of all countries
in Africa. Then the inverse image of A under L is the set

L⁻¹(A) = {c ∈ C | L(c) ∈ A}
       = {c ∈ C | the country in which c is located is in Africa}.

Thus, L⁻¹(A) is the set of all cities in African countries.

Let’s do one more example. Let f: R → R be defined by the formula
f(x) = x², and let X = {x ∈ R | 0 ≤ x < 2}. Then

f(X) = {f(x) | x ∈ X} = {x² | 0 ≤ x < 2}.

Thus, f(X) is the set of all squares of real numbers between 0 and 2
(including 0 but not 2). A moment’s reflection should convince you that
this set is {x ∈ R | 0 ≤ x < 4}. Now let’s let Y = {x ∈ R | 0 ≤ x < 4} and
compute f⁻¹(Y). According to the definition of inverse image,

f⁻¹(Y) = {x ∈ R | f(x) ∈ Y}
       = {x ∈ R | 0 ≤ f(x) < 4}
       = {x ∈ R | 0 ≤ x² < 4}
       = {x ∈ R | −2 < x < 2}.


By now you have had enough experience writing proofs that you 
should be ready to put your proof-writing skills to work in answering 
mathematical questions. Thus, most of this section will be devoted to a 
research project in which you will discover for yourself the answers to 
basic mathematical questions about images and inverse images. To get 
you started, we’ll work out the answer to the first question. 

Suppose f: A → B, and W and X are subsets of A. A natural question
you might ask is whether or not f(W ∩ X) must be the same as f(W) ∩
f(X). It seems plausible that the answer is yes, so let’s see if we can prove
it. Thus, our goal will be to prove that f(W ∩ X) = f(W) ∩ f(X). Because
this is an equation between two sets, we proceed by taking an arbitrary
element of each set and trying to prove that it is an element of the other.

Suppose first that y is an arbitrary element of f(W ∩ X). By the definition
of f(W ∩ X), this means that y = f(x) for some x ∈ W ∩ X. Since x ∈ W ∩
X, it follows that x ∈ W and x ∈ X. But now we have y = f(x) and x ∈ W,
so we can conclude that y ∈ f(W). Similarly, since y = f(x) and x ∈ X, it
follows that y ∈ f(X). Thus, y ∈ f(W) ∩ f(X). This completes the first half
of the proof.

Now suppose that y ∈ f(W) ∩ f(X). Then y ∈ f(W), so there is some w
∈ W such that f(w) = y, and also y ∈ f(X), so there is some x ∈ X such
that y = f(x). If only we knew that w and x were equal, we could conclude
that w = x ∈ W ∩ X, so y = f(x) ∈ f(W ∩ X). But the best we can do is to
say that f(w) = y = f(x). This should remind you of the definition of
one-to-one. If we knew that f was one-to-one, we could conclude from the fact
that f(w) = f(x) that w = x, and the proof would be done. But without this
information we seem to be stuck.

Let’s summarize what we’ve discovered. First of all, the first half of the
proof worked fine, so we can certainly say that in general f(W ∩ X) ⊆
f(W) ∩ f(X). The second half worked if we knew that f was one-to-one, so
we can also say that if f is one-to-one, then f(W ∩ X) = f(W) ∩ f(X). But
what if f isn’t one-to-one? There might be some way of fixing up the proof
to show that the equation f(W ∩ X) = f(W) ∩ f(X) is still true even if f isn’t
one-to-one. But by now you have probably come to suspect that perhaps f(W
∩ X) and f(W) ∩ f(X) are not always equal, so maybe we should devote
some time to trying to show that the proposed theorem is incorrect. In
other words, let’s see if we can find a counterexample: an example of a
function f and sets W and X for which f(W ∩ X) ≠ f(W) ∩ f(X).

Fortunately, we can do better than just trying examples at random. Of 
course, we know we’d better use a function that isn’t one-to-one, but by 
examining our attempt at a proof, we can tell more than that. The 
attempted proof that f(W ∩ X) = f(W) ∩ f(X) ran into trouble only when W
and X contained elements w and x such that w ≠ x but f(w) = f(x), so we
should choose an example in which this happens. In other words, not only 
should we make sure f isn’t one-to-one, we should also make sure W and 
X contain elements that show that f isn’t one-to-one. 


Figure 5.6. [Diagram: a function from A to B.]


The graph in Figure 5.6 shows a simple function that isn’t one-to-one.
Writing it as a set of ordered pairs, we could say f = {(1, 4), (2, 5), (3, 5)}
and f: A → B, where A = {1, 2, 3} and B = {4, 5, 6}. The two elements of
A that show that f is not one-to-one are 2 and 3, so these should be
elements of W and X, respectively. Why not just try letting W = {2} and X
= {3}? With these choices we get f(W) = {f(2)} = {5} and f(X) = {f(3)} =
{5}, so f(W) ∩ f(X) = {5} ∩ {5} = {5}. But f(W ∩ X) = f(∅) = ∅, so f(W
∩ X) ≠ f(W) ∩ f(X). (If you’re not sure why f(∅) = ∅, work it out using
Definition 5.5.1!) If you want to see an example in which W ∩ X ≠ ∅, try
W = {1, 2} and X = {1, 3}.
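
Both choices of W and X can be checked mechanically. This is a small sketch in which the function f is stored as a dict; the helper names image and compare are illustrative only.

```python
def image(f, X):
    """f(X) = {f(x) | x in X}, for a function f stored as a dict."""
    return {f[x] for x in X}


f = {1: 4, 2: 5, 3: 5}   # the function f = {(1, 4), (2, 5), (3, 5)}


def compare(W, X):
    """Return the pair (f(W ∩ X), f(W) ∩ f(X))."""
    return image(f, W & X), image(f, W) & image(f, X)


first = compare({2}, {3})
second = compare({1, 2}, {1, 3})
print(first)
print(second)
```

In both cases the first set of the pair is a proper subset of the second: 5 belongs to f(W) ∩ f(X) only because two different elements, 2 and 3, are mapped to it.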

This example shows that it would be incorrect to state a theorem saying
that f(W ∩ X) and f(W) ∩ f(X) are always equal. But our proof shows that
the following theorem is correct:


Theorem 5.5.2. Suppose f: A → B, and W and X are subsets of A. Then
f(W ∩ X) ⊆ f(W) ∩ f(X). Furthermore, if f is one-to-one, then f(W ∩ X) =
f(W) ∩ f(X).
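
A theorem like this one can be tested exhaustively on small sets before (or after) it is proven. The sketch below tries every function from a three-element set A to a three-element set B and every pair of subsets of A; the sets and all names in the code are assumptions made for illustration, and a finite search of this kind is a sanity check, not a proof.

```python
from itertools import combinations, product


def subsets(s):
    """Return a list of all subsets of the finite collection s."""
    items = list(s)
    return [set(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def image(f, X):
    return {f[x] for x in X}


A = [0, 1, 2]
B = [0, 1, 2]
subset_failures = 0      # cases where f(W ∩ X) ⊆ f(W) ∩ f(X) fails
injective_failures = 0   # one-to-one cases where equality fails

for values in product(B, repeat=len(A)):     # all 27 functions from A to B
    f = dict(zip(A, values))
    one_to_one = len(set(values)) == len(A)
    for W in subsets(A):
        for X in subsets(A):
            left = image(f, W & X)
            right = image(f, W) & image(f, X)
            if not left <= right:
                subset_failures += 1
            if one_to_one and left != right:
                injective_failures += 1

print(subset_failures, injective_failures)
```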


Now, here are some questions for you to try to answer. In each case, try 
to figure out as much as you can. Justify your answers with proofs and 
counterexamples. 


1. Suppose f: A → B and W and X are subsets of A.

(a) Will it always be true that f(W ∪ X) = f(W) ∪ f(X)?

(b) Will it always be true that f(W \ X) = f(W) \ f(X)?

(c) Will it always be true that W ⊆ X ↔ f(W) ⊆ f(X)?

2. Suppose f: A → B and Y and Z are subsets of B.

(a) Will it always be true that f⁻¹(Y ∩ Z) = f⁻¹(Y) ∩ f⁻¹(Z)?

(b) Will it always be true that f⁻¹(Y ∪ Z) = f⁻¹(Y) ∪ f⁻¹(Z)?

(c) Will it always be true that f⁻¹(Y \ Z) = f⁻¹(Y) \ f⁻¹(Z)?

(d) Will it always be true that Y ⊆ Z ↔ f⁻¹(Y) ⊆ f⁻¹(Z)?

3. Suppose f: A → B and X ⊆ A. Will it always be true that f⁻¹(f(X)) =
X?

4. Suppose f: A → B and Y ⊆ B. Will it always be true that f(f⁻¹(Y)) =
Y?

5. Suppose f: A → A and C ⊆ A. Prove that the following statements are
equivalent:

(a) C is closed under f.

(b) f(C) ⊆ C.

(c) C ⊆ f⁻¹(C).

6. Suppose f: A → B and g: B → C. Can you prove any interesting
theorems about images and inverse images of sets under g ∘ f?


Note: An observant reader may have noticed an ambiguity in our notation
for images and inverse images. If f: A → B and Y ⊆ B, then we have used
the notation f⁻¹(Y) to stand for the inverse image of Y under f. But if f is
one-to-one and onto, then, as we saw in Section 5.3, f⁻¹ is a function from
B to A. Thus, f⁻¹(Y) could also be interpreted as the image of Y under the
function f⁻¹. Fortunately, this ambiguity is harmless, as the next problem
shows.


7. Suppose f: A → B, f is one-to-one and onto, and Y ⊆ B. Show that the
inverse image of Y under f and the image of Y under f⁻¹ are equal.
(Hint: First write out the definitions of the two sets carefully!)
Mathematical Induction


6.1. Proof by Mathematical Induction 


In Chapter 3 we studied proof techniques that could be used in reasoning
about any mathematical topic. In this chapter we’ll discuss one more proof
technique, called mathematical induction, that is designed for proving
statements about what is perhaps the most fundamental of all mathematical
structures, the natural numbers. Recall that the set of all natural numbers is
N = {0, 1, 2, 3, ...}.


Suppose you want to prove that every natural number has some property 
P. In other words, you want to show that 0, 1, 2, . . . all have the property P. 
Of course, there are infinitely many numbers in this list, so you can’t check 
one-by-one that they all have property P. The key idea behind mathematical 
induction is that to list all the natural numbers all you have to do is start 
with 0 and repeatedly add 1. Thus, you can show that every natural number 
has the property P by showing that 0 has property P, and that whenever you 
add 1 to a number that has property P, the resulting number also has 
property P. This would guarantee that, as you go through the list of all 
natural numbers, starting with 0 and repeatedly adding 1, every number you 
encounter must have property P. In other words, all natural numbers have 
property P. Here, then, is how the method of mathematical induction works. 


To prove a goal of the form ∀n ∈ N P(n):

First prove P(0), and then prove ∀n ∈ N(P(n) → P(n + 1)). The first of
these proofs is sometimes called the base case and the second the induction
step.

Form of final proof:

Base case: [Proof of P(0) goes here.]
Induction step: [Proof of ∀n ∈ N(P(n) → P(n + 1)) goes here.]


We’ll say more about the justification of the method of mathematical 
induction later, but first let’s look at an example of a proof that uses 
mathematical induction. The following list of calculations suggests a 
surprising pattern: 


2⁰ = 1 = 2¹ − 1
2⁰ + 2¹ = 1 + 2 = 3 = 2² − 1
2⁰ + 2¹ + 2² = 1 + 2 + 4 = 7 = 2³ − 1
2⁰ + 2¹ + 2² + 2³ = 1 + 2 + 4 + 8 = 15 = 2⁴ − 1


The general pattern appears to be:

2⁰ + 2¹ + ··· + 2ⁿ = 2ⁿ⁺¹ − 1.


Will this pattern hold for all values of n? Let’s see if we can prove it. 
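
Before attempting a proof, it does no harm to test the conjectured pattern on more values of n than the four listed above. The following sketch does this for n below 50; the function names lhs and rhs and the bound 50 are arbitrary illustrative choices. A check like this can refute a conjecture, but it cannot establish it for all n, which is why a proof is still needed.

```python
def lhs(n):
    """The sum 2^0 + 2^1 + ... + 2^n, computed term by term."""
    return sum(2**k for k in range(n + 1))


def rhs(n):
    """The conjectured closed form 2^(n+1) - 1."""
    return 2**(n + 1) - 1


checked = [n for n in range(50) if lhs(n) == rhs(n)]
print(len(checked) == 50)
```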


Example 6.1.1. Prove that for every natural number n, 2⁰ + 2¹ + ··· + 2ⁿ =
2ⁿ⁺¹ − 1.


Scratch work 
Our goal is to prove the statement ∀n ∈ N P(n), where P(n) is the
statement 2⁰ + 2¹ + ··· + 2ⁿ = 2ⁿ⁺¹ − 1. According to our strategy, we can do
this by proving two other statements, P(0) and ∀n ∈ N(P(n) → P(n + 1)).


Plugging in 0 for n, we see that P(0) is simply the statement 2⁰ = 2¹ − 1,
the first statement in our list of calculations. The proof of this is easy — just 
do the arithmetic to verify that both sides are equal to 1. Often the base case 
of an induction proof is very easy, and the only hard work in figuring out 
the proof is in carrying out the induction step. 

For the induction step, we must prove ∀n ∈ N(P(n) → P(n + 1)). Of
course, all of the proof techniques discussed in Chapter 3 can be used in
mathematical induction proofs, so we can do this by letting n be an arbitrary
natural number, assuming that P(n) is true, and then proving that P(n + 1) is
true. In other words, we’ll let n be an arbitrary natural number, assume that
2⁰ + 2¹ + ··· + 2ⁿ = 2ⁿ⁺¹ − 1, and then prove that 2⁰ + 2¹ + ··· + 2ⁿ⁺¹ =
2ⁿ⁺² − 1. This gives us the following givens and goal:


Givens                              Goal
n ∈ N                               2⁰ + 2¹ + ··· + 2ⁿ⁺¹ = 2ⁿ⁺² − 1
2⁰ + 2¹ + ··· + 2ⁿ = 2ⁿ⁺¹ − 1


Clearly the second given is similar to the goal. Is there some way to start
with the second given and derive the goal using algebraic steps? The key to
the proof is to recognize that the left side of the equation in the goal is
exactly the same as the left side of the second given, but with the extra term
2ⁿ⁺¹ added on. So let’s try adding 2ⁿ⁺¹ to both sides of the second given.
This gives us

(2⁰ + 2¹ + ··· + 2ⁿ) + 2ⁿ⁺¹ = (2ⁿ⁺¹ − 1) + 2ⁿ⁺¹,

or in other words,

2⁰ + 2¹ + ··· + 2ⁿ⁺¹ = 2 · 2ⁿ⁺¹ − 1 = 2ⁿ⁺² − 1.

This is the goal, so we are done!


Solution 


Theorem. For every natural number n, 2⁰ + 2¹ + ··· + 2ⁿ = 2ⁿ⁺¹ − 1.


Proof. We use mathematical induction. 

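Base case: Setting n = 0, we get 2⁰ = 1 = 2¹ − 1 as required.

Induction step: Let n be an arbitrary natural number and suppose that 2⁰
+ 2¹ + ··· + 2ⁿ = 2ⁿ⁺¹ − 1. Then

2⁰ + 2¹ + ··· + 2ⁿ⁺¹ = (2⁰ + 2¹ + ··· + 2ⁿ) + 2ⁿ⁺¹
                     = (2ⁿ⁺¹ − 1) + 2ⁿ⁺¹        (inductive hypothesis)
                     = 2 · 2ⁿ⁺¹ − 1
                     = 2ⁿ⁺² − 1.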


Does the proof in Example 6.1.1 convince you that the equation 2⁰ + 2¹ +
··· + 2ⁿ = 2ⁿ⁺¹ − 1, which we called P(n) in our scratch work, is true for all
natural numbers n? Well, certainly P(0) is true, since we checked that
explicitly in the base case of the proof. In the induction step we showed that
∀n ∈ N(P(n) → P(n + 1)), so we know that for every natural number n,
P(n) → P(n + 1). For example, plugging in n = 0 we can conclude that P(0)
→ P(1). But now we know that both P(0) and P(0) → P(1) are true, so
applying modus ponens we can conclude that P(1) is true too. Similarly,
plugging in n = 1 in the induction step we get P(1) → P(2), so applying
modus ponens to the statements P(1) and P(1) → P(2) we can conclude that
P(2) is true. Setting n = 2 in the induction step we get P(2) → P(3), so by
modus ponens, P(3) is true. Continuing in this way, you should be able to
see that by repeatedly applying the induction step you can show that P(n)
must be true for every natural number n. In other words, the proof really
does show that ∀n ∈ N P(n).


As we saw in the last example, the hardest part of a proof by
mathematical induction is usually the induction step, in which you must
prove the statement ∀n ∈ N(P(n) → P(n + 1)). It is usually best to do this
by letting n be an arbitrary natural number, assuming P(n) is true, and then
proving that P(n + 1) is true. The assumption that P(n) is true is sometimes
called the inductive hypothesis, and the key to the proof is usually to work
out some relationship between the inductive hypothesis P(n) and the goal
P(n + 1).

Here’s another example of a proof by mathematical induction. 


Example 6.1.2. Prove that ∀n ∈ N(3 | (n³ − n)).


Scratch work 


As usual, the base case is easy to check. The details are given in the
following proof. For the induction step, we let n be an arbitrary natural
number and assume that 3 | (n³ − n), and we must prove that 3 | ((n + 1)³ −
(n + 1)). Filling in the definition of divides, we can sum up our situation as
follows:


Givens                     Goal
n ∈ N                      ∃j ∈ Z(3j = (n + 1)³ − (n + 1))
∃k ∈ Z(3k = n³ − n)


The second given is the inductive hypothesis, and we need to figure out
how it can be used to establish the goal.

According to our techniques for dealing with existential quantifiers in
proofs, the best thing to do first is to use the second given and let k stand for
a particular integer such that 3k = n³ − n. To complete the proof we’ll need
to find an integer j (which will probably be related to k in some way) such
that 3j = (n + 1)³ − (n + 1). We expand the right side of this equation,
looking for some way to relate it to the given equation 3k = n³ − n:

(n + 1)³ − (n + 1) = n³ + 3n² + 3n + 1 − n − 1
                   = (n³ − n) + 3n² + 3n
                   = 3k + 3n² + 3n
                   = 3(k + n² + n).


It should now be clear that we can complete the proof by letting j = k + n² +
n. As in similar earlier proofs, we don’t bother to mention j in the proof.


Solution 


Theorem. For every natural number n, 3 | (n³ − n).


Proof. We use mathematical induction.
Base case: If n = 0, then n³ − n = 0 = 3 · 0, so 3 | (n³ − n).
Induction step: Let n be an arbitrary natural number and suppose 3 | (n³ −
n). Then we can choose an integer k such that 3k = n³ − n. Thus,

(n + 1)³ − (n + 1) = n³ + 3n² + 3n + 1 − n − 1
                   = (n³ − n) + 3n² + 3n
                   = 3k + 3n² + 3n
                   = 3(k + n² + n).


Therefore 3 | ((n + 1)³ − (n + 1)), as required.
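
The induction step does more than show that a suitable integer exists: it tells us how to compute one. The recursive sketch below follows the proof, with the base case returning the witness 0 and the induction step turning a witness k for m into the witness k + m² + m for m + 1. The function name witness is an illustrative assumption, and the recursion is only meant for small n.

```python
def witness(n):
    """Return an integer j with 3*j == n**3 - n, built as in the proof."""
    if n == 0:
        return 0              # base case: 0**3 - 0 == 3 * 0
    m = n - 1
    k = witness(m)            # inductive hypothesis: 3*k == m**3 - m
    return k + m**2 + m       # induction step: j = k + m^2 + m


ok = all(3 * witness(n) == n**3 - n for n in range(200))
print(ok)
```

Each recursive call corresponds to one application of the induction step, so computing witness(n) unwinds the proof n times, ending at the base case.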


Once you understand why mathematical induction works, you should be
able to understand proofs that involve small variations on the method of
induction. The next example illustrates such a variation. In this example
we’ll try to figure out which is larger, n² or 2ⁿ. Let’s try out a few values of
n:


n     n²     2ⁿ     Which is larger?
0     0      1      2ⁿ
1     1      2      2ⁿ
2     4      4      tie
3     9      8      n²
4     16     16     tie
5     25     32     2ⁿ
6     36     64     2ⁿ


It’s a close race at first, but starting with n = 5, it looks like 2ⁿ is taking a
decisive lead over n². Can we prove that it will stay ahead for larger values
of n?


Example 6.1.3. Prove that ∀n ≥ 5(2ⁿ > n²).
Scratch work


We are only interested in proving the inequality 2ⁿ > n² for n ≥ 5. Thus, it
would make no sense to use n = 0 in the base case of our induction proof.
We’ll take n = 5 as the base case for our induction rather than n = 0. Once
we’ve checked that the inequality holds when n = 5, the induction step will
show that the inequality must continue to hold if, starting with n = 5, we
repeatedly add 1 to n. Thus, it must also hold for n = 6, 7, 8, .... In other
words, we’ll be able to conclude that the inequality holds for all n ≥ 5.

The base case n = 5 has already been checked in the table. For the
induction step, we let n ≥ 5 be arbitrary, assume 2ⁿ > n², and try to prove
that 2ⁿ⁺¹ > (n + 1)². How can we relate the inductive hypothesis to the goal?
Perhaps the simplest relationship involves the left sides of the two
inequalities: 2ⁿ⁺¹ = 2 · 2ⁿ. Thus, multiplying both sides of the inductive
hypothesis 2ⁿ > n² by 2, we can conclude that 2ⁿ⁺¹ > 2n². Now compare this
inequality to the goal, 2ⁿ⁺¹ > (n + 1)². If we could prove that 2n² ≥ (n + 1)²,
then the goal would follow easily. So let’s forget about the original goal and
see if we can prove that 2n² ≥ (n + 1)².

Multiplying out the right side of the new goal we see that we must prove
that 2n² ≥ n² + 2n + 1, or in other words n² ≥ 2n + 1. This isn’t hard to prove:
Since we’ve assumed that n ≥ 5, it follows that n² ≥ 5n = 2n + 3n > 2n + 1.


Solution 


Theorem. For every natural number n ≥ 5, 2ⁿ > n².


Proof. By mathematical induction.
Base case: When n = 5 we have 2ⁿ = 32 > 25 = n².
Induction step: Let n ≥ 5 be arbitrary, and suppose that 2ⁿ > n². Then

2ⁿ⁺¹ = 2 · 2ⁿ
     > 2n²                      (inductive hypothesis)
     = n² + n²
     ≥ n² + 5n                  (since n ≥ 5)
     = n² + 2n + 3n
     > n² + 2n + 1 = (n + 1)².

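The chain of inequalities in the induction step can be checked numerically, one link at a time. In the sketch below, the function name chain_holds and the upper bound 200 are illustrative assumptions; the check confirms the arithmetic for particular values of n but does not replace the proof.

```python
def chain_holds(n):
    """Check 2^(n+1) > 2n^2 >= n^2 + 5n > n^2 + 2n + 1 = (n + 1)^2."""
    a = 2**(n + 1)
    b = 2 * n**2             # uses the inductive hypothesis 2^n > n^2
    c = n**2 + 5 * n         # uses n >= 5
    d = n**2 + 2 * n + 1
    return a > b >= c > d == (n + 1)**2


all_hold = all(chain_holds(n) for n in range(5, 200))
fails_at_4 = not chain_holds(4)
print(all_hold, fails_at_4)
```

For n = 4 the first link fails, because 2⁵ = 32 is not greater than 2 · 4² = 32; this matches the tie at n = 4 in the table and shows why the base case has to be n = 5.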

Exercises 
*1. Prove that for all n ∈ N, 0 + 1 + 2 + ··· + n = n(n + 1)/2.

2. Prove that for all n ∈ N, 0² + 1² + 2² + ··· + n² = n(n + 1)(2n + 1)/6.

*3. Prove that for all n ∈ N, 0³ + 1³ + 2³ + ··· + n³ = [n(n + 1)/2]².

4. Find a formula for 1 + 3 + 5 + ··· + (2n − 1), for n ≥ 1, and prove that
your formula is correct. (Hint: First try some particular values of n and
look for a pattern.)

5. Prove that for all n ∈ N, 0 · 1 + 1 · 2 + 2 · 3 + ··· + n(n + 1) = n(n + 1)
(n + 2)/3.

6. Find a formula for 0 · 1 · 2 + 1 · 2 · 3 + 2 · 3 · 4 + ··· + n(n + 1)(n + 2),
for n ∈ N, and prove that your formula is correct. (Hint: Compare this
exercise to exercises 1 and 5, and try to guess the formula.)

7. Find a formula for 3⁰ + 3¹ + 3² + ··· + 3ⁿ, for n ≥ 0, and prove that your
formula is correct. (Hint: Try to guess the formula, basing your guess
on Example 6.1.1. Then try out some values of n and adjust your guess
if necessary.)

8. Prove that for all n ≥ 1,

1 − 1/2 + 1/3 − 1/4 + ··· + 1/(2n − 1) − 1/(2n)
    = 1/(n + 1) + 1/(n + 2) + 1/(n + 3) + ··· + 1/(2n).

9. (a) Prove that for all n ∈ N, 2 | (n² + n).
(b) Prove that for all n ∈ N, 6 | (n³ − n).

10. Prove that for all n ∈ N, 64 | (9ⁿ − 8n − 1).

11. Prove that for all n ∈ N, 9 | (4ⁿ + 6n − 1).

12. (a) Prove that for all n ∈ N, 7ⁿ − 5ⁿ is even.
(b) Prove that for all n ∈ N, 24 | (2 · 7ⁿ − 3 · 5ⁿ + 1).

13. Prove that for all integers a and b and all n ∈ N, (a − b) | (aⁿ − bⁿ).
(Hint: Let a and b be arbitrary integers and then prove by induction that
∀n ∈ N[(a − b) | (aⁿ − bⁿ)]. For the induction step, you must relate aⁿ⁺¹
− bⁿ⁺¹ to aⁿ − bⁿ. You might find it useful to start by completing the
following equation: aⁿ⁺¹ − bⁿ⁺¹ = a(aⁿ − bⁿ) + ?.)

14. Prove that for all integers a and b and all n ∈ N, (a + b) | (a²ⁿ⁺¹ +
b²ⁿ⁺¹).

15. Prove that for all n ≥ 10, 2ⁿ > n³.

16. (a) Prove that for all n ∈ N, either n is even or n is odd, but not both.
(b) Prove that, as claimed in Section 3.4, every integer is either even or
odd, but not both. (Hint: To prove that a negative integer n is even or
odd, but not both, apply part (a) to −n.)

17. Prove that for all n ≥ 1, 2 · 2¹ + 3 · 2² + 4 · 2³ + ··· + (n + 1)2ⁿ = n2ⁿ⁺¹.

18. (a) What’s wrong with the following proof that for every n ∈ N, 1 · 3⁰
+ 3 · 3¹ + 5 · 3² + ··· + (2n + 1)3ⁿ = n3ⁿ⁺¹?

Proof. We use mathematical induction. Let n be an arbitrary natural
number, and suppose 1 · 3⁰ + 3 · 3¹ + 5 · 3² + ··· + (2n + 1)3ⁿ = n3ⁿ⁺¹.
Then

1 · 3⁰ + 3 · 3¹ + 5 · 3² + ··· + (2n + 1)3ⁿ + (2n + 3)3ⁿ⁺¹
    = n3ⁿ⁺¹ + (2n + 3)3ⁿ⁺¹
    = (3n + 3)3ⁿ⁺¹
    = (n + 1)3ⁿ⁺²,

as required. □

(b) Find a formula for 1 · 3⁰ + 3 · 3¹ + 5 · 3² + ··· + (2n + 1)3ⁿ, and prove
that your formula is correct.

19. Suppose a is a real number and a < 0. Prove that for all n ∈ N, if n is
even then aⁿ > 0, and if n is odd then aⁿ < 0.

20. Suppose a and b are real numbers and 0 < a < b.

(a) Prove that for all n ≥ 1, 0 < aⁿ < bⁿ. (Notice that this generalizes
Example 3.1.2.)

(b) Prove that for all n ≥ 2, 0 < ⁿ√a < ⁿ√b.

(c) Prove that for all n ≥ 1, abⁿ + baⁿ < aⁿ⁺¹ + bⁿ⁺¹.

(d) Prove that for all n ≥ 2,

((a + b)/2)ⁿ < (aⁿ + bⁿ)/2.


6.2. More Examples 


We introduced mathematical induction in the last section as a method for 
proving that all natural numbers have some property. However, the 
applications of mathematical induction extend far beyond the study of the 
natural numbers. In this section we’ll look at some examples of proofs by 
mathematical induction that illustrate the wide range of uses of induction. 


Example 6.2.1. Suppose R is a partial order on a set A. Prove that every
finite, nonempty set B ⊆ A has an R-minimal element.


Scratch work 


You might think at first that mathematical induction is not appropriate for
this proof, because the goal doesn’t seem to have the form ∀n ∈ N P(n). In
fact, the goal doesn’t explicitly mention natural numbers at all! But we can
see that natural numbers enter into the problem when we recognize that to
say that B is finite and nonempty means that it has n elements, for some n
∈ N, n ≥ 1. (We’ll give a more careful definition of the number of elements
in a finite set in Chapter 8. For the moment, an intuitive understanding of
this concept will suffice.) Thus, the goal means ∀n ≥ 1 ∀B ⊆ A(B has n
elements → B has a minimal element). We can now use induction to prove
this statement.

In the base case we will have n = 1, so we must prove that if B has one 
element, then it has a minimal element. It is easy to check that in this case 
the one element of B must be minimal. 


For the induction step we let n ≥ 1 be arbitrary, assume that ∀B ⊆ A(B
has n elements → B has a minimal element), and try to prove that ∀B ⊆
A(B has n + 1 elements → B has a minimal element). Guided by the form of
the goal, we let B be an arbitrary subset of A, assume that B has n + 1
elements, and try to prove that B has a minimal element.

How can we use the inductive hypothesis to reach our goal? The 
inductive hypothesis tells us that if we had a subset of A with n elements, 
then it would have a minimal element. To apply it, we need to find a subset 
of A with n elements. Our arbitrary set B is a subset of A, and we have 
assumed that it has n + 1 elements. Thus, a simple way to produce a subset 
of A with n elements would be to remove one element from B. It is not clear 
where this reasoning will lead, but it seems to be the simplest way to make 
use of the inductive hypothesis. Let’s give it a try. 

Let b be any element of B, and let B′ = B \ {b}. Then B′ is a subset of A
with n elements, so by the inductive hypothesis, B′ has a minimal element.
This is an existential statement, so we immediately introduce a new
variable, say c, to stand for a minimal element of B′.


Our goal is to prove that B has a minimal element, which is also an 
existential statement, so we should try to come up with a minimal element 
of B. We only know about two elements of B at this point, b and c, so we 
should probably try to prove that one of these is a minimal element of B. 
Which one? Well, it may depend on whether one of them is smaller than the 
other according to the partial order R. This suggests that we may need to 
use proof by cases. In our proof we use the cases bRc and ¬bRc. In the first
case we prove that b is a minimal element of B, and in the second case we 
prove that c is a minimal element of B. Note that to say that something is a 
minimal element of B is a negative statement, so in both cases we use proof 
by contradiction. 


Solution 


Theorem. Suppose R is a partial order on a set A. Then every finite, 
nonempty set B ⊆ A has an R-minimal element.


Proof. We will show by induction that for every natural number n ≥ 1, every
subset of A with n elements has a minimal element. 

Base case: n = 1. Suppose B ⊆ A and B has one element. Then B = {b}
for some b ∈ A. Clearly ¬∃x ∈ B(x ≠ b), so certainly ¬∃x ∈ B(xRb ∧ x ≠
b). Thus, b is minimal.

Induction step: Suppose n ≥ 1, and suppose that every subset of A with n
elements has a minimal element. Now let B be an arbitrary subset of A with
n + 1 elements. Let b be any element of B, and let B' = B \ {b}, a subset of A
with n elements. By the inductive hypothesis, we can choose a minimal
element c ∈ B'.

Case 1. bRc. We claim that b is a minimal element of B. To see why,
suppose it isn’t. Then we can choose some x ∈ B such that xRb and x ≠ b.
Since x ≠ b, x ∈ B'. Also, since xRb and bRc, by transitivity of R it follows
that xRc. Thus, since c is a minimal element of B', we must have x = c. But
then since xRb we have cRb, and we also know bRc, so by antisymmetry of
R it follows that b = c. This is clearly impossible, since c ∈ B' = B \ {b}.
Thus, b must be a minimal element of B.

Case 2. ¬bRc. We claim in this case that c is a minimal element of B. To
see why, suppose it isn’t. Then we can choose some x ∈ B such that xRc
and x ≠ c. Since c is a minimal element of B', we can’t have x ∈ B', so the
only other possibility is x = b. But then since xRc we must have bRc, which
contradicts our assumption that ¬bRc. Thus, c is a minimal element of B.
□
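
The induction step of this proof is really a procedure for finding a minimal element, so it can be run as a recursive program. The following Python sketch is only an illustration under assumptions of our own: the names minimal_element, leq, divides, and is_minimal do not come from the text, leq(x, y) stands for xRy, and R is assumed to be a partial order on a finite, nonempty collection.

```python
from typing import Callable, Hashable, Iterable


def minimal_element(elements: Iterable[Hashable],
                    leq: Callable[[Hashable, Hashable], bool]) -> Hashable:
    """Return an R-minimal element of a finite, nonempty collection B.

    leq(x, y) plays the role of xRy and is assumed to be a partial order.
    """
    items = list(elements)
    if not items:
        raise ValueError("the set B must be nonempty")
    if len(items) == 1:
        # Base case: the one element of B is minimal.
        return items[0]
    # Induction step: remove b and find a minimal element c of B' = B \ {b}.
    b, rest = items[0], items[1:]
    c = minimal_element(rest, leq)
    # Case 1 (bRc): b is minimal in B.  Case 2 (not bRc): c is minimal in B.
    return b if leq(b, c) else c


def divides(x: int, y: int) -> bool:
    """Divisibility on positive integers, used here as a sample partial order."""
    return y % x == 0


def is_minimal(m, elements, leq) -> bool:
    """Check the definition directly: no x in B has xRm and x != m."""
    return not any(leq(x, m) and x != m for x in elements)
```

For instance, is_minimal(minimal_element(B, divides), B, divides) should hold for any finite, nonempty list B of positive integers. A set can have several minimal elements, and the sketch returns whichever one the case analysis selects.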


Note that an infinite subset of a partially ordered set need not have a 
minimal element, as we saw in part 1 of Example 4.4.5. Thus, the 
assumption that B is finite was needed in our last theorem. This theorem 
can be used to prove another interesting fact about partial orders, again 
using mathematical induction: 


Example 6.2.2. Suppose A is a finite set and R is a partial order on A. Prove 
that R can be extended to a total order on A. In other words, prove that there 
is a total order T on A such that R ⊆ T.


Scratch work 
We’ll only outline the proof, leaving many details as exercises. The idea is 
to prove by induction that ∀n ∈ N ∀A ∀R[(A has n elements and R is a
partial order on A) → ∃T(T is a total order on A and R ⊆ T)]. The induction
step is similar to the induction step of the last example. If R is a partial 
order on a set A with n + 1 elements, then we remove one element, call it a, 
from A, and apply the inductive hypothesis to the remaining set A’ = A\{a}. 
This will give us a total order T' on A’, and to complete the proof we must 
somehow turn this into a total order T on A such that R ⊆ T. The relation T'
already tells us how to compare any two elements of A', but it doesn’t tell us
how to compare a to the elements of A'. This is what we must decide in
order to define T, and the main difficulty in this step of the proof is that we
must make this decision in such a way that we end up with R ⊆ T. Our
resolution of this difficulty in the following proof involves choosing a 
carefully in the first place. We choose a to be an R-minimal element of A, 
and then when we define T, we make a smaller in the T ordering than every 
element of A’. We use the theorem in the last example, with B = A, to 
guarantee that A has an R-minimal element. 


Solution 


Theorem. Suppose A is a finite set and R is a partial order on A. Then there 
is a total order T on A such that R ⊆ T.


Proof. We will show by induction on n that every partial order on a set with 
n elements can be extended to a total order. Clearly this suffices to prove 
the theorem. 


Base case: n = 0. Suppose R is a partial order on A and A has 0 elements. 
Then clearly A = R = ∅. It is easy to check that ∅ is a total order on A (all
required properties hold vacuously), so we are done. 

Induction step: Let n be an arbitrary natural number, and suppose that 
every partial order on a set with n elements can be extended to a total order. 
Now suppose that A has n + 1 elements and R is a partial order on A. By the 
theorem in the last example, there must be some a ∈ A such that a is an
R-minimal element of A. Let A' = A \ {a} and let R' = R ∩ (A' × A'). You are
asked to show in exercise 1 that R' is a partial order on A'. By the inductive
hypothesis, we can let T' be a total order on A' such that R' ⊆ T'. Now let T
= T' ∪ ({a} × A). You are also asked to show in exercise 1 that T is a total
order on A and R ⊆ T, as required.

□
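
The construction in this proof can also be carried out mechanically: repeatedly pick an R-minimal element of what is left and place it before everything that remains. The Python sketch below unrolls the induction into a loop. It is an illustration only; the names linear_extension, leq, and divides are our own, and leq is assumed to be a partial order on a finite collection.

```python
from typing import Callable, Hashable, List


def divides(x: int, y: int) -> bool:
    """Divisibility on positive integers, used here as a sample partial order."""
    return y % x == 0


def linear_extension(elements,
                     leq: Callable[[Hashable, Hashable], bool]) -> List:
    """List the elements so that xRy implies x appears no later than y.

    Reading the list from left to right gives a total order T with R ⊆ T.
    """
    remaining = list(elements)
    order = []
    while remaining:
        # Choose an R-minimal element a of the remaining set A.
        a = next(m for m in remaining
                 if not any(leq(x, m) and x != m for x in remaining))
        # T = T' ∪ ({a} × A): a comes before every element still remaining.
        order.append(a)
        remaining.remove(a)
    return order
```

Different choices of minimal element can give different total orders, so the result is one extension of R, not the only one.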


The theorem in the last example can be extended to apply to partial 
orders on infinite sets. For a step in this direction, see exercise 19 in Section 
8.1. 


Example 6.2.3. Prove that for all n ≥ 3, if n distinct points on a circle are
connected in consecutive order with straight lines, then the interior angles
of the resulting polygon add up to (n − 2)180°.


Solution 


Figure 6.1 shows an example with n = 4. We won’t give the scratch work 
separately for this proof. 


Figure 6.1. α + β + γ + δ = (4 − 2)180° = 360°.


Theorem. For all n ≥ 3, if n distinct points on a circle are connected in
consecutive order with straight lines, then the interior angles of the
resulting polygon add up to (n − 2)180°.


Proof. We use induction on n. 


Base case: Suppose n = 3. Then the polygon is a triangle, and it is well 
known that the interior angles of a triangle add up to 180°. 

Induction step: Let n be an arbitrary natural number, n ≥ 3, and assume
the statement is true for n. Now consider the polygon P formed by
connecting some n + 1 distinct points A_1, A_2, ..., A_{n+1} on a circle. If we
skip the last point A_{n+1}, then we get a polygon P' with only n vertices, and
by the inductive hypothesis the interior angles of this polygon add up to (n
− 2)180°. But now as you can see in Figure 6.2, the sum of the interior
angles of P is equal to the sum of the interior angles of P' plus the sum of
the interior angles of the triangle A_1 A_n A_{n+1}. Since the sum of the interior
angles of the triangle is 180°, we can conclude that the sum of the interior
angles of P is


(n − 2)180° + 180° = ((n + 1) − 2)180°,


as required. 


□
Figure 6.2.


Example 6.2.4. Prove that for any positive integer n, a 2^n × 2^n square grid
with any one square removed can be covered with L-shaped tiles, each made
of three squares arranged in an L.


Scratch work 


Figure 6.3 shows an example for the case n = 2. In this case 2^n = 4, so we
have a 4 × 4 grid, and the square that has been removed is shaded. The
heavy lines show how the remaining squares can be covered with five
L-shaped tiles.


Figure 6.3. (a) 4 × 4 grid with one square removed. (b) Grid covered with
L-shaped tiles.


We’ll use induction in our proof, and because we’re only interested in 
positive n, the base case will be n = 1. In this case we have a 2 x 2 grid with 
one square removed, and this can clearly be covered with one L-shaped tile. 
(Draw a picture!) 

For the induction step, we let n be an arbitrary positive integer and
assume that a 2^n × 2^n grid with any one square removed can be covered with
L-shaped tiles. Now suppose we have a 2^{n+1} × 2^{n+1} grid with one square
removed. To use our inductive hypothesis we must somehow relate this to
the 2^n × 2^n grid. Since 2^{n+1} = 2^n · 2, the 2^{n+1} × 2^{n+1} grid is twice as wide and
twice as high as the 2^n × 2^n grid. In other words, by dividing the 2^{n+1} × 2^{n+1}
grid in half both horizontally and vertically, we can split it into four 2^n × 2^n
“subgrids.” This is illustrated in Figure 6.4. The one square that has been
removed will be in one of the four subgrids; in Figure 6.4, it is in the upper
right.


Figure 6.4. (Each of the four subgrids measures 2^n × 2^n.)


The inductive hypothesis tells us that it is possible to cover the upper 
right subgrid in Figure 6.4 with L-shaped tiles. But what about the other 
three subgrids? It turns out that there is a clever way of placing one tile on 
the grid so that the inductive hypothesis can then be used to show that the 
remaining subgrids can be covered. See if you can figure it out before 
reading the answer in the following proof. 


Solution 


Theorem. For any positive integer n, a 2^n × 2^n square grid with any one
square removed can be covered with L-shaped tiles. 


Proof. We use induction on n. 


Base case: Suppose n = 1. Then the grid is a 2 x 2 grid with one square 
removed, which can clearly be covered with one L-shaped tile. 

Induction step: Let n be an arbitrary positive integer, and suppose that a
2^n × 2^n grid with any one square removed can be covered with L-shaped
tiles. Now consider a 2^{n+1} × 2^{n+1} grid with one square removed. Cut the
grid in half both vertically and horizontally, splitting it into four 2^n × 2^n
subgrids. The one square that has been removed comes from one of these
subgrids, so by the inductive hypothesis the rest of this subgrid can be
covered with L-shaped tiles. To cover the other three subgrids, first place
one L-shaped tile in the center so that it covers one square from each of the
three remaining subgrids, as illustrated in Figure 6.5. The area remaining to
be covered now contains every square except one in each of the subgrids, so
by applying the inductive hypothesis to each subgrid we can see that this
area can be covered with tiles.

□


Figure 6.5. 


It is interesting to note that this proof can actually be used to figure out 
how to place tiles on a particular grid. For example, consider the 8 x 8 grid 
with one square removed shown in Figure 6.6. 


Figure 6.6. 


According to the preceding proof, the first step in covering this grid with 
tiles is to split it into four 4 x 4 subgrids and place one tile in the center, 
covering one square from each subgrid except the upper left. This is 
illustrated in Figure 6.7. The area remaining to be covered now consists of 
four 4 x 4 subgrids with one square removed from each of them. 

How do we cover the remaining 4 x 4 subgrids? By the same method, of 
course! For example, let’s cover the subgrid in the upper right of Figure 6.7. 
We need to cover every square of this subgrid except the lower left corner, 
which has already been covered. We start by cutting it into four 2 x 2 
subgrids and put one tile in the middle, as in Figure 6.8. The area remaining 
to be covered now consists of four 2 x 2 subgrids with one square removed 
from each. Each of these can be covered with one tile, thus completing the 
upper right subgrid of Figure 6.7. 


Figure 6.7. 


Figure 6.8. 


The remaining three quarters of Figure 6.7 are completed by a similar 
procedure. The final solution is shown in Figure 6.9. 


Figure 6.9. 


The method we used in solving this problem is an example of a recursive 
procedure. We solved the problem for an 8 × 8 grid by splitting it into four
4 × 4 grid problems. To solve each of these, we split it into four 2 × 2
problems, each of which was easy to solve. If we had started with a larger
grid, we might have had to repeat the splitting many times before reaching 
easy 2 x 2 problems. Recursion and its relationship to mathematical 
induction are the subject of our next section. 
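
The recursive procedure just described translates almost line for line into a program. The Python sketch below is offered only as an illustration: the function names, the grid representation, and the labeling scheme are assumptions of ours, not part of the text. For brevity it recurses down to 1 × 1 subgrids, where nothing remains to be covered, so that a 2 × 2 grid is handled by the same step that places the central tile.

```python
from typing import List


def tile_grid(n: int, hole_row: int, hole_col: int) -> List[List[int]]:
    """Cover a 2^n x 2^n grid, minus one square, with L-shaped tiles.

    Returns a grid of integers: 0 marks the removed square, and the
    three squares of each tile share one positive label.
    """
    full = 2 ** n
    grid = [[-1] * full for _ in range(full)]
    grid[hole_row][hole_col] = 0
    next_label = [1]

    def cover(top: int, left: int, side: int, hr: int, hc: int) -> None:
        # (hr, hc) is the one square of this subgrid that must not be covered.
        if side == 1:
            return
        half = side // 2
        label = next_label[0]
        next_label[0] += 1
        for dr in (0, 1):
            for dc in (0, 1):
                sub_top, sub_left = top + dr * half, left + dc * half
                in_rows = sub_top <= hr < sub_top + half
                in_cols = sub_left <= hc < sub_left + half
                if in_rows and in_cols:
                    # This subgrid already contains the excluded square.
                    cover(sub_top, sub_left, half, hr, hc)
                else:
                    # One square of the central tile lands in this subgrid;
                    # afterwards that square is the one to exclude.
                    cr, cc = top + half - 1 + dr, left + half - 1 + dc
                    grid[cr][cc] = label
                    cover(sub_top, sub_left, half, cr, cc)

    cover(0, 0, full, hole_row, hole_col)
    return grid
```

Calling tile_grid(3, 1, 2), for example, tiles an 8 × 8 grid whose removed square is in row 1, column 2 (counting from 0). Every other square receives the label of exactly one tile.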


Exercises 


*1. Complete the proof in Example 6.2.2 by doing the following proofs.
(We use the same notation here as in the example.)
(a) Prove that R' is a partial order on A'.
(b) Prove that T is a total order on A and R ⊆ T.

2. Suppose R is a partial order on a set A, B ⊆ A, and B is finite. Prove that
there is a partial order T on A such that R ⊆ T and ∀x ∈ B ∀y ∈ A(xTy
∨ yTx). Note that, in particular, if A is finite we can let B = A, and the
conclusion then means that T is a total order on A. Thus, this gives an
alternative approach to the proof of the theorem in Example 6.2.2.
(Hint: Use induction on the number of elements in B. For the induction
step, assume the conclusion holds for any set B ⊆ A with n elements,
and suppose B is a subset of A with n + 1 elements. Let b be any
element of B and let B' = B \ {b}, a subset of A with n elements. By the
inductive hypothesis, let T' be a partial order on A such that R ⊆ T' and
∀x ∈ B' ∀y ∈ A(xT'y ∨ yT'x). Now let A_1 = {x ∈ A | (x, b) ∈ T'} and
A_2 = A \ A_1, and let T = T' ∪ (A_1 × A_2). Prove that T has all the required
properties.)

3. Suppose R is a total order on a set A. Prove that every finite, nonempty
set B ⊆ A has an R-smallest element and an R-largest element.

4. (a) Suppose R is a relation on A, and ∀x ∈ A ∀y ∈ A(xRy ∨ yRx).
(Note that this implies that R is reflexive.) Prove that for every
finite, nonempty set B ⊆ A there is some x ∈ B such that ∀y ∈
B((x, y) ∈ R ∘ R). (Hint: Imitate Example 6.2.1.)
(b) Consider a tournament in which each contestant plays every other
contestant exactly once, and one of them wins. We’ll say that a
contestant x is excellent if, for every other contestant y, either x beats y
or there is a third contestant z such that x beats z and z beats y. Prove
that there is at least one excellent contestant.


5. For each n ∈ N, let F_n = 2^{(2^n)} + 1. (These numbers are called the
Fermat numbers, after the French mathematician Pierre de Fermat
(1601-1665). Fermat showed that F_0, F_1, F_2, F_3, and F_4 are prime, and
conjectured that all of the Fermat numbers are prime. However, over
100 years later Euler showed that F_5 is not prime. It is not known if
there is any n > 4 for which F_n is prime.)
Prove that for all n ≥ 1, F_n = (F_0 · F_1 · F_2 ··· F_{n−1}) + 2.


6. Prove that if n ≥ 1 and a_1, a_2, ..., a_n are any real numbers, then
|a_1 + a_2 + ··· + a_n| ≤ |a_1| + |a_2| + ··· + |a_n|. (Note that this generalizes the
triangle inequality; see exercise 13(c) of Section 3.5.)

7. (a) Prove that if a and b are positive real numbers, then a/b + b/a ≥ 2.
(Hint: Start with the fact that (a − b)^2 ≥ 0.)
(b) Suppose that a, b, and c are real numbers and 0 < a ≤ b ≤ c. Prove that
b/c + c/a − b/a ≥ 1. (Hint: Start with the fact that (c − a)(c − b) ≥ 0.)
(c) Prove that if n ≥ 2 and a_1, a_2, ..., a_n are real numbers such that
0 < a_1 ≤ a_2 ≤ ··· ≤ a_n, then a_1/a_2 + a_2/a_3 + ··· + a_{n−1}/a_n + a_n/a_1 ≥ n.

*8. If n ≥ 2 and a_1, a_2, ..., a_n is a list of positive real numbers, then the
number (a_1 + a_2 + ··· + a_n)/n is called the arithmetic mean of the
numbers a_1, a_2, ..., a_n, and the number \sqrt[n]{a_1 a_2 ··· a_n} is called their
geometric mean. In this exercise you will prove the arithmetic mean-
geometric mean inequality, which says that the arithmetic mean is
always at least as large as the geometric mean.
(a) Prove that the arithmetic mean-geometric mean inequality holds for
lists of numbers of length 2. In other words, prove that for all positive
real numbers a and b, (a + b)/2 ≥ \sqrt{ab}.
(b) Prove that the arithmetic mean-geometric mean inequality holds for
any list of numbers whose length is a power of 2. In other words, prove
that for all n ≥ 1, if a_1, a_2, ..., a_{2^n} is a list of positive real numbers,
then

(a_1 + a_2 + ··· + a_{2^n})/2^n ≥ \sqrt[2^n]{a_1 a_2 ··· a_{2^n}}.

(c) Suppose that n_0 ≥ 2 and the arithmetic mean-geometric mean
inequality fails for some list of length n_0. In other words, there are
positive real numbers a_1, a_2, ..., a_{n_0} such that

(a_1 + a_2 + ··· + a_{n_0})/n_0 < \sqrt[n_0]{a_1 a_2 ··· a_{n_0}}.

Prove that for all n ≥ n_0, the arithmetic mean-geometric mean
inequality fails for some list of length n.


(d) Prove that the arithmetic mean-geometric mean inequality always
holds.

9. Prove that if n ≥ 2 and a_1, a_2, ..., a_n is a list of positive real numbers,
then

n/(1/a_1 + 1/a_2 + ··· + 1/a_n) ≤ \sqrt[n]{a_1 a_2 ··· a_n}.

(Hint: Apply exercise 8. The number on the left side of the inequality
above is called the harmonic mean of the numbers a_1, a_2, ..., a_n.)

10. (a) Prove that if a_1, a_2, b_1, and b_2 are real numbers, with a_1 ≤ a_2 and
b_1 ≤ b_2, then a_1 b_2 + a_2 b_1 ≤ a_1 b_1 + a_2 b_2.
(b) Suppose that n is a positive integer, a_1, a_2, ..., a_n and b_1, b_2, ..., b_n
are real numbers, a_1 ≤ a_2 ≤ ··· ≤ a_n, b_1 ≤ b_2 ≤ ··· ≤ b_n, and f is a
one-to-one, onto function from {1, 2, ..., n} to {1, 2, ..., n}. Prove that
a_1 b_{f(1)} + a_2 b_{f(2)} + ··· + a_n b_{f(n)} ≤ a_1 b_1 + a_2 b_2 + ··· + a_n b_n. (This fact is
known as the rearrangement inequality.)

11. Prove that for every set A, if A has n elements then 𝒫(A) has 2^n
elements.

12. If A is a set, let 𝒫_2(A) be the set of all subsets of A that have exactly
two elements. Prove that for every set A, if A has n elements then
𝒫_2(A) has n(n − 1)/2 elements. (Hint: See the solution for exercise 11.)

13. Suppose n is a positive integer. An equilateral triangle is cut into 4^n
congruent equilateral triangles by equally spaced line segments parallel
to the sides of the triangle, and one corner is removed. (Figure 6.10
shows an example in the case n = 2.) Show that the remaining area can
be covered by trapezoidal tiles, each formed from three of the small
triangles.


Figure 6.10.


14. Let n be a positive integer. Suppose n chords are drawn in a circle in
such a way that each chord intersects every other, but no three intersect
at one point. Prove that the chords cut the circle into (n^2 + n + 2)/2
regions. (Figure 6.11 shows an example in the case n = 4. Note that
there are (4^2 + 4 + 2)/2 = 11 regions in this figure.)


Figure 6.11.


15. Let n be a positive integer, and suppose that n chords are drawn in a
circle in any way, cutting the circle into a number of regions. Prove that
the regions can be colored with two colors in such a way that adjacent
regions (that is, regions that share an edge) are different colors. (Figure
6.12 shows an example in the case n = 4.)


Figure 6.12.


16. Prove that for every finite set A and every function f : A → A, if f is
one-to-one then f is onto. (Hint: Use induction on the number of elements in
A. For the induction step, assume the conclusion holds for any set A
with n elements, and suppose that A has n + 1 elements and f : A → A.
Suppose f is one-to-one but not onto. Then there is some a ∈ A such
that a ∉ Ran(f). Let A' = A \ {a} and f' = f ∩ (A' × A'). Show that f' : A'
→ A', f' is one-to-one, and f' is not onto, which contradicts the inductive
hypothesis.)

17. What’s wrong with the following proof that if A ⊆ N and 0 ∈ A then A
= N?

Proof. We will prove by induction that ∀n ∈ N(n ∈ A).
Base case: If n = 0, then n ∈ A by assumption.
Induction step: Let n ∈ N be arbitrary, and suppose that n ∈ A.
Since n was arbitrary, it follows that every natural number is an
element of A, and therefore in particular n + 1 ∈ A.
□

18. Suppose f : R → R. What’s wrong with the following proof that for
every finite, nonempty set A ⊆ R there is a real number c such that ∀x
∈ A(f(x) = c)?

Proof. We will prove by induction that for every n ≥ 1, if A is any
subset of R with n elements then ∃c ∈ R ∀x ∈ A(f(x) = c).
Base case: n = 1. Suppose A ⊆ R and A has one element. Then A =
{a}, for some a ∈ R. Let c = f(a). Then clearly ∀x ∈ A(f(x) = c).
Induction step: Suppose n ≥ 1, and for all A ⊆ R, if A has n elements
then ∃c ∈ R ∀x ∈ A(f(x) = c). Now suppose A ⊆ R and A has n + 1
elements. Let a_1 be any element of A, and let A_1 = A \ {a_1}. Then A_1
has n elements, so by the inductive hypothesis there is some c_1 ∈ R
such that ∀x ∈ A_1(f(x) = c_1). If we can show that f(a_1) = c_1 then we
will be done, since then it will follow that ∀x ∈ A(f(x) = c_1).
Let a_2 be an element of A that is different from a_1, and let A_2 = A \
{a_2}. Applying the inductive hypothesis again, we can choose a
number c_2 ∈ R such that ∀x ∈ A_2(f(x) = c_2). Notice that since a_1 ≠
a_2, a_1 ∈ A_2, so f(a_1) = c_2. Now let a_3 be an element of A that is
different from both a_1 and a_2. Then a_3 ∈ A_1 and a_3 ∈ A_2, so f(a_3) = c_1
and f(a_3) = c_2. Therefore c_1 = c_2, so f(a_1) = c_1, as required.
□


6.3. Recursion 


In Chapter 3 we learned to prove statements of the form ∀nP(n) by letting
n be arbitrary and proving P(n). In this chapter we’ve learned another 
method for proving such statements, when n ranges over the natural 
numbers: prove P(0), and then prove that for any natural number n, if P(n) 
is true then so is P(n + 1). Once we have proven these statements, we can 
run through all the natural numbers in order and see that P must be true of 
all of them. 

We can use a similar idea to introduce a new way of defining functions. 
In Chapter 5, we usually defined a function f by saying how to compute f(n) 
for any n in the domain of f. If the domain of f is the set of all natural 
numbers, an alternative method to define f would be to say what f(0) is and 
then, for any natural number n, say how we could compute f(n + 1) if we 
already knew the value of f(n). Such a definition would enable us to run 
through all the natural numbers in order computing the image of each one 
under f. 

For example, we might use the following equations to define a function f 
with domain N: 


f(0) = 1;
for every n ∈ N, f(n + 1) = (n + 1) · f(n).


The second equation tells us how to compute f(n + 1), but only if we 
already know the value of f(n). Thus, although we cannot use this equation 
to tell us directly what the image of any number is under f, we can use it to 
run through all the natural numbers in order and compute their images. 

We start with f(0), which we know from the first equation is equal to 1.
Plugging in n = 0 in the second equation, we see that f(1) = 1 · f(0) = 1 · 1 =
1, so we’ve determined the value of f(1). But now that we know that f(1) =
1, we can use the second equation again to compute f(2). Plugging in n = 1
in the second equation, we find that f(2) = 2 · f(1) = 2 · 1 = 2. Similarly,
setting n = 2 in the second equation we get f(3) = 3 · f(2) = 3 · 2 = 6.
Continuing in this way we can compute f(n) for any natural number n. Thus,
the two equations really do give us a rule that determines a unique value
f(n) for each natural number n, so they define a function f with domain N.
Definitions of this kind are called recursive definitions. 

Sometimes we’ll work backwards when using a recursive definition to 
evaluate a function. For example, suppose we want to compute f(6), where f 
is the function just defined. According to the second equation in the 
definition of f, f(6) = 6 · f(5), so to complete the calculation we must
compute f(5). Using the second equation again, we find that f(5) = 5 · f(4),
so we must compute f(4). Continuing in this way leads to the following 
calculation: 

f(6) = 6 · f(5)
     = 6 · 5 · f(4)
     = 6 · 5 · 4 · f(3)
     = 6 · 5 · 4 · 3 · f(2)
     = 6 · 5 · 4 · 3 · 2 · f(1)
     = 6 · 5 · 4 · 3 · 2 · 1 · f(0)
     = 6 · 5 · 4 · 3 · 2 · 1 · 1
     = 720.


Perhaps now you recognize the function f. For any positive integer n, f(n)
= n · (n − 1) · (n − 2) ··· 1, and f(0) = 1. The number f(n) is called n
factorial, and is denoted n!. (Recall that we used this notation in our proof 
of Theorem 3.7.3.) For example, 6! = 720. Often, if a function can be 
written as a formula with an ellipsis (. . .) in it, then the use of the ellipsis 
can be avoided by giving a recursive definition for the function. Such a 
definition is usually easier to work with. 
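
A recursive definition of this kind can be copied almost word for word into a program. The following Python sketch mirrors the two equations defining f and works backwards from n to 0 just as in the calculation of f(6) above. The function name factorial is our own choice, and the sketch assumes its argument is a natural number.

```python
def factorial(n: int) -> int:
    """n! computed directly from the recursive definition of f."""
    if n < 0:
        raise ValueError("n must be a natural number")
    if n == 0:
        return 1                     # first equation: f(0) = 1
    return n * factorial(n - 1)      # second equation: f(n) = n * f(n - 1)
```

Evaluating factorial(6) triggers the same chain of calls as the calculation of f(6): each call waits for the value at the next smaller argument until f(0) is reached.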

Many familiar functions are most easily defined using recursive 
definitions. For example, for any number a, we could define a^n with the
following recursive definition:


a^0 = 1;
for every n ∈ N, a^{n+1} = a^n · a.


Using this definition, we would compute a^4 like this:


a^4 = a^3 · a
    = a^2 · a · a
    = a^1 · a · a · a
    = a^0 · a · a · a · a
    = 1 · a · a · a · a.

For another example, consider the sum 2^0 + 2^1 + 2^2 + ··· + 2^n, which
appeared in the first example of this chapter. The ellipsis suggests that we
might be able to use a recursive definition. If we let f(n) = 2^0 + 2^1 + 2^2 + ··· +
2^n, then notice that for every n ∈ N, f(n + 1) = 2^0 + 2^1 + 2^2 + ··· + 2^n + 2^{n+1} =
f(n) + 2^{n+1}. Thus, we could define f recursively as follows:

f(0) = 2^0 = 1;
for every n ∈ N, f(n + 1) = f(n) + 2^{n+1}.

As a check that this definition is right, let’s try it out in the case n = 3:


f(3) = f(2) + 2^3
     = f(1) + 2^2 + 2^3
     = f(0) + 2^1 + 2^2 + 2^3
     = 1 + 2 + 4 + 8
     = 15.
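
Both of these recursive definitions can be checked by machine. The Python sketch below transcribes them directly; the names power and sum_of_powers_of_two are our own, and both functions assume that n is a natural number.

```python
def power(a, n: int):
    """a^n from the recursive definition a^0 = 1, a^(n+1) = a^n * a."""
    if n < 0:
        raise ValueError("n must be a natural number")
    return 1 if n == 0 else power(a, n - 1) * a


def sum_of_powers_of_two(n: int) -> int:
    """f(n) from f(0) = 1 and f(n + 1) = f(n) + 2^(n+1)."""
    if n < 0:
        raise ValueError("n must be a natural number")
    return 1 if n == 0 else sum_of_powers_of_two(n - 1) + power(2, n)
```

Comparing sum_of_powers_of_two(n) with 2^{n+1} − 1 for a range of values of n gives a quick sanity check of the formula from the first example of this chapter, although of course it is no substitute for the proof.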


Sums such as the one in the last example come up often enough that there
is a special notation for them. If a_0, a_1, ..., a_n is a list of numbers, then the
sum of these numbers is written \sum_{i=0}^{n} a_i. This is read “the sum as i goes
from 0 to n of a_i.” For example, we can use this notation to write the sum in
the last example:


\sum_{i=0}^{n} 2^i = 2^0 + 2^1 + 2^2 + ··· + 2^n.


More generally, if n ≥ m, then


\sum_{i=m}^{n} a_i = a_m + a_{m+1} + a_{m+2} + ··· + a_n.


For example,


\sum_{i=3}^{6} i^2 = 3^2 + 4^2 + 5^2 + 6^2 = 9 + 16 + 25 + 36 = 86.


The letter i in these formulas is a bound variable and therefore can be 
replaced by a new variable without changing the meaning of the formula. 
Now let’s try giving a recursive definition for this notation. We let m be
an arbitrary integer, and then proceed by recursion on n. Just as the base
case for an induction proof need not be n = 0, the base for a recursive
definition can also be a number other than 0. In this case we are only
interested in n ≥ m, so we take n = m as the base for our recursion:


\sum_{i=m}^{m} a_i = a_m;
for every n ≥ m, \sum_{i=m}^{n+1} a_i = \sum_{i=m}^{n} a_i + a_{n+1}.


Trying this definition out on the previous example, we get 


\sum_{i=3}^{6} i^2 = \sum_{i=3}^{5} i^2 + 6^2
                   = \sum_{i=3}^{4} i^2 + 5^2 + 6^2
                   = \sum_{i=3}^{3} i^2 + 4^2 + 5^2 + 6^2
                   = 3^2 + 4^2 + 5^2 + 6^2,


just as we wanted. 
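
The recursive definition of summation notation can be written as a program in the same way. In the Python sketch below, the name summation and the parameter term (a function giving a_i for each i) are our own assumptions, and the base of the recursion is n = m, as in the definition.

```python
from typing import Callable


def summation(m: int, n: int, term: Callable[[int], int]) -> int:
    """Sum of term(i) for i = m, ..., n, by recursion on n (requires n >= m)."""
    if n < m:
        raise ValueError("the recursive definition only covers n >= m")
    if n == m:
        return term(m)                            # base: the sum from m to m
    return summation(m, n - 1, term) + term(n)    # step: add the last term
```

For example, summation(3, 6, lambda i: i * i) unwinds exactly as in the calculation above and evaluates to 86.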


Clearly induction and recursion are closely related, so it shouldn’t be 
surprising that if a concept has been defined by recursion, then proofs 
involving this concept are often best done by induction. For example, in 
Section 6.1 we saw some proofs by induction that involved summations and 
exponentiation, and now we have seen that summations and exponentiation
can be defined recursively. Because the factorial function can also be
defined recursively, proofs involving factorials also often use induction. 


Example 6.3.1. Prove that for every n ≥ 4, n! > 2^n.
Scratch work 


Because the problem involves factorial and exponentiation, both of which 
are defined recursively, induction seems like a good method to use. The 
base case will be n = 4, and it is just a matter of simple arithmetic to check 
that the inequality is true in this case. For the induction step, our inductive 
hypothesis will be n! > 2^n, and we must prove that (n + 1)! > 2^{n+1}. Of
course, the way to relate the inductive hypothesis to the goal is to use the
recursive definitions of factorial and exponentiation, which tell us that (n +
1)! = (n + 1) · n! and 2^{n+1} = 2^n · 2. Once these equations are plugged in, the
rest is fairly straightforward.


Solution 
Theorem. For every n ≥ 4, n! > 2^n.


Proof. By mathematical induction.
Base case: When n = 4 we have n! = 24 > 16 = 2^n.
Induction step: Let n ≥ 4 be arbitrary and suppose that n! > 2^n. Then


(n + 1)! = (n + 1) · n!
         > (n + 1) · 2^n        (inductive hypothesis)
         > 2 · 2^n = 2^{n+1}.
□
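
A quick computation shows why the induction has to start at n = 4. The Python sketch below uses math.factorial from the standard library; the helper name factorial_exceeds_power_of_two is our own. Checking finitely many cases proves nothing about all n, but it does confirm the base case and shows that the inequality fails for smaller n.

```python
import math


def factorial_exceeds_power_of_two(n: int) -> bool:
    """Report whether n! > 2^n for a natural number n."""
    return math.factorial(n) > 2 ** n
```

The function returns False for n = 0, 1, 2, 3 and True for n = 4, which matches the choice of base case in the proof.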


Example 6.3.2. Prove that for every real number a and all natural numbers
m and n, a^{m+n} = a^m · a^n.


Scratch work 


There are three universal quantifiers here, and we’ll treat the first two
differently from the third. We let a and m be arbitrary and then use
mathematical induction to prove that ∀n ∈ N(a^{m+n} = a^m · a^n). The key
algebraic fact in the induction step will be the formula a^{n+1} = a^n · a from the
recursive definition of exponentiation.


Solution 


Theorem. For every real number a and all natural numbers m and n, a^{m+n}
= a^m · a^n.


Proof. Let a be an arbitrary real number and m an arbitrary natural number.
We now proceed by induction on n.
Base case: When n = 0, we have a^{m+n} = a^{m+0} = a^m = a^m · 1 = a^m · a^0 = a^m
· a^n.
Induction step: Suppose a^{m+n} = a^m · a^n. Then

a^{m+(n+1)} = a^{(m+n)+1}
            = a^{m+n} · a        (definition of exponentiation)
            = a^m · a^n · a      (inductive hypothesis)
            = a^m · a^{n+1}      (definition of exponentiation).
□

Example 6.3.3. A sequence of numbers a_0, a_1, a_2, ... is defined recursively
as follows:

a_0 = 0;
for every n ∈ N, a_{n+1} = 2a_n + 1.


Find a formula for a_n and prove that your formula is correct.


Scratch work 


It’s probably a good idea to start out by computing the first few terms in the
sequence. We already know a_0 = 0, so plugging in n = 0 in the second
equation we get a_1 = 2a_0 + 1 = 0 + 1 = 1. Thus, plugging in n = 1, we get a_2
= 2a_1 + 1 = 2 + 1 = 3. Continuing in this way we get the following table of
values:

n      0   1   2   3   4    5
a_n    0   1   3   7   15   31

Aha! The numbers we’re getting are one less than the powers of 2. It
looks like the formula is probably a_n = 2^n − 1, but we can’t be sure this is
right unless we prove it. Fortunately, it is fairly easy to prove the formula
by induction.


Solution 
Theorem. If the sequence a_0, a_1, a_2, ... is defined by the recursive
definition given earlier, then for every natural number n, a_n = 2^n − 1.


Proof. By induction.
Base case: a_0 = 0 = 2^0 − 1.
Induction step: Suppose a_n = 2^n − 1. Then


a_{n+1} = 2a_n + 1                     (definition of a_{n+1})
        = 2(2^n − 1) + 1               (inductive hypothesis)
        = 2^{n+1} − 2 + 1 = 2^{n+1} − 1.
□

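
The guess-then-prove strategy of this example is easy to support with a short computation. The Python sketch below transcribes the recursive definition of the sequence (the function name a simply follows the notation a_n) and can be used to tabulate terms and compare them with the conjectured formula 2^n − 1. Agreement on finitely many terms only suggests the formula; the induction proof is what establishes it.

```python
def a(n: int) -> int:
    """a_n from the recursive definition a_0 = 0, a_(n+1) = 2 * a_n + 1."""
    if n < 0:
        raise ValueError("n must be a natural number")
    return 0 if n == 0 else 2 * a(n - 1) + 1
```

For instance, [a(n) for n in range(6)] reproduces the first terms of the sequence computed in the scratch work.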
We end this section with a rather unusual example. We’ll prove that for
every real number x > −1 and every natural number n, (1 + x)^n > nx. A
natural way to proceed would be to let x > −1 be arbitrary, and then use
induction on n. In the induction step we assume that (1 + x)^n > nx, and then
try to prove that (1 + x)^{n+1} > (n + 1)x. Because we’ve assumed x > −1, we
have 1 + x > 0, so we can multiply both sides of the inductive hypothesis (1
+ x)^n > nx by 1 + x to get


(1 + x)^{n+1} = (1 + x)(1 + x)^n
              > (1 + x)nx
              = nx + nx^2.

But the conclusion we need for the induction step is (1 + x)^{n+1} > (n + 1)x,
and it’s not clear how to get this conclusion from the inequality we’ve
derived.


Our solution to this difficulty will be to replace our original problem with
a problem that appears to be harder but is actually easier. Instead of proving
the inequality (1 + x)^n > nx directly, we’ll prove (1 + x)^n ≥ 1 + nx, and then
observe that since 1 + nx > nx, it follows immediately that (1 + x)^n > nx.
You might think that if we had difficulty proving (1 + x)^n > nx, we’ll surely
have more difficulty proving the stronger statement (1 + x)^n ≥ 1 + nx. But it
turns out that the approach we tried unsuccessfully on the original problem 
works perfectly on the new problem! 


Theorem 6.3.4. For every x > −1 and every natural number n, (1 + x)^n > nx.


Proof. Let x > −1 be arbitrary. We will prove by induction that for every
natural number n, (1 + x)^n ≥ 1 + nx, from which it clearly follows that (1 +
x)^n > nx.
Base case: If n = 0, then (1 + x)^n = (1 + x)^0 = 1 = 1 + 0 = 1 + nx.
Induction step: Suppose (1 + x)^n ≥ 1 + nx. Then


(1 + x)^{n+1} = (1 + x)(1 + x)^n
              ≥ (1 + x)(1 + nx)        (inductive hypothesis)
              = 1 + x + nx + nx^2
              ≥ 1 + (n + 1)x           (since nx^2 ≥ 0).
□
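
The strengthened inequality can be spot-checked with exact arithmetic. The Python sketch below uses Fraction from the standard library so that no rounding error creeps in; the name bernoulli_gap is our own, and the check covers only the sample values tried, so it illustrates the theorem without proving it.

```python
from fractions import Fraction


def bernoulli_gap(x: Fraction, n: int) -> Fraction:
    """(1 + x)^n - (1 + n*x), which the proof shows is >= 0 when x > -1."""
    if x <= -1:
        raise ValueError("the theorem assumes x > -1")
    return (1 + x) ** n - (1 + n * x)
```

The gap is exactly 0 when n = 0 or n = 1, which is why the strengthened statement has to use ≥ and not >.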
Exercises 
*1. Find a formula for }7"_, ah and prove that your formula is correct. 


2. Prove that for all n ≥ 1,


\sum_{i=1}^{n} 1/(i(i + 1)(i + 2)) = (n^2 + 3n)/(4(n + 1)(n + 2)).


3. Prove that for all n ≥ 2,

\sum_{i=2}^{n} 1/((i − 1)(i + 1)) = (3n^2 − n − 2)/(4n(n + 1)).

4. Prove that for all n ∈ N,

\sum_{i=0}^{n} (2i + 1)^2 = (n + 1)(2n + 1)(2n + 3)/3.

5. Suppose r is a real number and r ≠ 1. Prove that for all n ∈ N,

\sum_{i=0}^{n} r^i = (r^{n+1} − 1)/(r − 1).

(Note that this exercise generalizes Example 6.1.1 and exercise 7 of
Section 6.1.)

*6. Prove that for all n ≥ 1,

7. (a) Suppose a_0, a_1, a_2, ..., a_n and b_0, b_1, b_2, ..., b_n are two
sequences of real numbers. Prove that

\sum_{i=0}^{n} (a_i + b_i) = \sum_{i=0}^{n} a_i + \sum_{i=0}^{n} b_i.

(b) Suppose c is a real number and a_0, a_1, ..., a_n is a sequence of real
numbers. Prove that

c · \sum_{i=0}^{n} a_i = \sum_{i=0}^{n} (c · a_i).

*8. The harmonic numbers are the numbers H_n for n ≥ 1 defined by the
formula

H_n = \sum_{i=1}^{n} 1/i.


(a) 


(b) 


(b) 


15. 


16. 


Prove that for all natural numbers n and m, ifn > m > 1 then H, - Hm = 


(n — m)/n. (Hint: Let m be an arbitrary natural number with m > 1 and 
then proceed by induction on n, with n = m as the base case of the 
induction.) 


Prove that for all n > 0, Hon > 1 + n/2. 
(For those who have studied calculus.) Show that lim, Hp = œ, so 
yo (1/1) diverges. 


Let H, be defined as in exercise 8. Prove that for all n = 2, 


n=l 


2 H; = nHn — n. 
k=l 


Find a formula for } `}; (i - (i !)) and prove that your formula is correct. 


Find a formula for ¥*_ọ(i/(i + 1)!) and prove that your formula is 
correct. 


(a) Prove that for all n E N, 2" > n. 
Prove that for all n > 9, n! > (2°. 
Prove that for all n € N, n! < 2), 
Suppose k is a positive integer. 

Prove that for all n € N, (k? + n)! > k”. 


Prove that for all n > 2k, n! > k”. (Hint: Use induction, and for the base 
case use part (a). Note that in the language of exercise 19 of Section 
5.1, this shows that if f(n) = k” and g(n) = n!, then f E O(g).) 
Prove that for every real number a and all natural numbers m and n, 
(a™)” = qmn 
A sequence dp, 44, A>, . . . is defined recursively as follows: 

a= 0; 


for every n € N, an1 = 2an +n. 


Prove that for all n E N, a, = 2” - n- 1. 


A sequence ap, 44, A», . . . is defined recursively as follows: 


17. 


18. 


(d) 


äg = 2: 


5 


for every n € N, ay4) = (ay)*. 


Find a formula for a,, and prove that your formula is correct. 
A sequence 44, A>, A3, . . . is defined recursively as follows: 
ay = l; 


an + | 


for every n = 1, &n+1 = 


Find a formula for a, and prove that your formula is correct. 


For n = k > 0, the quantity (/) is defined as follows: 


n 
k 


i) n! 
k) ki-(n—k)! 


Prove that for all n € N, (5) = (") = 1. 


Prove that for all natural numbers n and k, if n > k > 0 then ("{') = 
(x) + (r-r): 
If A is a set and k E N, let Y,; (A) be the set of all subsets of A that 
have k elements. Prove that if A has n elements and n= k= 0, then 2% 
(A) has ({) elements. (Hint: Prove by induction that Vn E NVA[A is a 
set with n elements >Vk(n > k > 0 > 2P, (A) has (%) elements)]. 
Imitate exercises 11 and 12 of Section 6.2. In fact, this exercise 
generalizes exercise 12 of Section 6.2. This exercise shows that (%) is 
the number of ways of choosing k elements out of a set of size n, so it 
is sometimes called n choose k.) 
Prove that for all real numbers x and y and every natural number n, 
. EV i n n=—k ok 
(x + y) -Dh y. 


n 


(This is called the binomial theorem, so the numbers (%) are sometimes 
called binomial coefficients.) 

Note: Parts (a) and (b) show that we can compute the numbers (%) 
conveniently by using a triangular array as in Figure 6.13. This array is 
called Pascal’s triangle, after the French mathematician Blaise Pascal 


19. 
(a) 


(b) 
20. 


21. 


(1623-1662). Each row of the triangle corresponds to a particular value 
of n, and it lists the values of (/') for all k from 0 to n. Part (a) shows 
that the first and last numbers in every row are 1. Part (b) shows that 
every other number is the sum of the two numbers above it. For 
example, the lines in Figure 6.13 illustrate that (;) = 3 is the sum of 
(?) = 2and (3) =1. 


n=0: l 
n=l: | 


n=2: 1 2 1 
NZ 
n=3: 1 33 1 


n=4 1464 1 
Figure 6.13. Pascal’s triangle. 


For the meaning of the notation used in this exercise, see exercise 18. 
Prove that for all n € N, X k-o (4) = 2”. (Hint: You can do this by 
induction using parts (a) and (b) of exercise 18, or you can combine 
part (c) of exercise 18 with exercise 11 of Section 6.2, or you can plug 
something in for x and y in part (d) of exercise 18.) 


Prove that for all n > 1, Eg-o(— 14 (X) = 0. 


A sequence dp, 44, a>, . . . is defined recursively as follows: 
ag = 0; 


. a 4 
for every n € N, dna) = (an) + rt 


Prove that for alln>1,0<a,<1. 
In this problem we will define, for each natural number n, a function f: 
Z* > Z*. The sequence of functions fo, fi» fo, . . . is defined recursively 
as follows: 

for every x € Z*, fo(x) = x; 


for every n € N and every x € Z*, fai (x) = 2/"™, 


(a) The first equation in this recursive definition gives a formula for fọ (x), 
namely fọ (x) = x. Find formulas for f, (x), fə (x), and fz (x). 

(b) Prove that for all natural numbers n and all positive integers x and y, if 
x < y then fn (x) < fn 0). 

(c) Prove that for all natural numbers m and n and all positive integers x, if 
m <n then fn (x) < f (xX). 

(d) Prove that for every natural number n, f, E O(f,.,) but fri; /E O(f,,)- 


(See exercise 19 in Section 5.1 for the meaning of the notation used 
here.) 


Now define g: Z* > Z* by the formula g(x) = f, (x). 


(e) Compute g(1), g(2), and g(3). (Do not try to compute g(4); the answer 
would be a number with more than 6 x 10197% digits.) 
(f) Prove that for every natural number n, f, E O(g) but g /€ O(f,). 


22. Explain the paradox in the proof of Theorem 6.3.4, in which we made 
the proof easier by changing the goal to a statement that looked like it 
would be harder to prove. 


6.4. Strong Induction 


In the induction step of a proof by mathematical induction, we prove that a 
natural number has some property based on the assumption that the 
previous number has the same property. In some cases this assumption isn’t 
strong enough to make the proof work, and we need to assume that all 
smaller natural numbers have the property. This is the idea behind a variant 
of mathematical induction sometimes called strong induction: 


To prove a goal of the form $\forall n \in \mathbb{N}\, P(n)$:


Prove that $\forall n[(\forall k < n\, P(k)) \to P(n)]$, where both n and k range over the
natural numbers in this statement. Of course, the most direct way to prove
this is to let n be an arbitrary natural number, assume that $\forall k < n\, P(k)$, and
then prove P(n).


Note that no base case is necessary in a proof by strong induction. All
that is needed is a modified form of the induction step in which we prove
that if every natural number smaller than n has the property P, then n has
the property P. In a proof by strong induction, we refer to the assumption
that every natural number smaller than n has the property P as the inductive
hypothesis.

To see why strong induction works, it might help if we first review 
briefly why ordinary induction works. Recall that a proof by ordinary 
induction enables us to go through all the natural numbers in order and see 
that each of them has some property P. The base case gets the process 
started, and the induction step shows that the process can always be 
continued from one number to the next. But note that in this process, by the 
time we check that some natural number n has the property P, we’ve 
already checked that all smaller numbers have the property. In other words, 
we already know that $\forall k < n\, P(k)$. The idea behind strong induction is that
we should be allowed to use this information in our proof of P(n). 

Let’s work out the details of this idea more carefully. Suppose that we
have followed the strong induction proof strategy, and we’ve proven the
statement $\forall n[(\forall k < n\, P(k)) \to P(n)]$. Then, plugging in 0 for n, we can
conclude that $(\forall k < 0\, P(k)) \to P(0)$. But because there are no natural
numbers smaller than 0, the statement $\forall k < 0\, P(k)$ is vacuously true.
Therefore, by modus ponens, P(0) is true. (This explains why the base case
doesn’t have to be checked separately in a proof by strong induction; the
base case P(0) actually follows from the modified form of the induction
step used in strong induction.) Similarly, plugging in 1 for n we can
conclude that $(\forall k < 1\, P(k)) \to P(1)$. The only natural number smaller than
1 is 0, and we’ve just shown that P(0) is true, so the statement $\forall k < 1\, P(k)$
is true. Therefore, by modus ponens, P(1) is also true. Now plug in 2 for n
to get the statement $(\forall k < 2\, P(k)) \to P(2)$. Since P(0) and P(1) are both
true, the statement $\forall k < 2\, P(k)$ is true, and therefore by modus ponens, P(2)
is true. Continuing in this way we can show that P(n) is true for every
natural number n, as required. For an alternative justification of the method
of strong induction, see exercise 1.
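The step-by-step unrolling described above can be imitated in a short program. The sketch below is only an illustration under stated assumptions: the names `check_by_strong_induction` and `binary_step` are hypothetical, and the sample property ("n is a sum of distinct powers of 2") is not from the text. The function `step` plays the role of the modified induction step, and the loop applies it to 0, 1, 2, ... in order, just as modus ponens is applied repeatedly in the argument.

```python
def check_by_strong_induction(step, limit):
    """Establish P(0), P(1), ..., P(limit - 1) in increasing order.

    step(n, known) receives n and the set of all k < n for which P(k) has
    already been established, and returns True if P(n) follows from that.
    """
    known = set()
    for n in range(limit):
        # For n == 0 the set `known` is empty, so the hypothesis is vacuous;
        # the loop itself needs no separate base case.
        if not step(n, known):
            raise ValueError(f"induction step failed at n = {n}")
        known.add(n)
    return known

def binary_step(n, known):
    """Sample step for P(n): n is a sum of distinct powers of 2."""
    if n == 0:
        return True            # the empty sum
    p = 1
    while 2 * p <= n:          # p becomes the largest power of 2 with p <= n
        p *= 2
    # n - p < p, so a representation of n - p cannot already use p.
    return (n - p) in known

established = check_by_strong_induction(binary_step, 50)
print(len(established))
```

Note that `binary_step` appeals to $n - p$, which is usually much smaller than $n - 1$; this is the kind of step that ordinary induction does not directly support.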

As our first example of the method of strong induction, we prove an 
important fact of number theory known as the division algorithm.


Theorem 6.4.1. (Division algorithm) For all natural numbers n and m, if
$m > 0$ then there are natural numbers q and r such that $n = qm + r$ and $r < m$.


(The numbers q and r are called the quotient and remainder when n is 
divided by m.) 


Scratch work 


We let m be an arbitrary positive integer and then use strong induction to
prove that $\forall n \exists q \exists r(n = qm + r \land r < m)$. According to the description of
strong induction, this means that we should let n be an arbitrary natural
number, assume that $\forall k < n\, \exists q \exists r(k = qm + r \land r < m)$, and prove that
$\exists q \exists r(n = qm + r \land r < m)$.

Our goal is an existential statement, so we should try to come up with 
values of q and r with the required properties. If n < m then this is easy 
because we can just let q = 0 and r = n. But if $n \ge m$, then this won’t work,
since we must have r < m, so we must do something different in this case. 
As usual in induction proofs, we look to the inductive hypothesis. The 
inductive hypothesis starts with Vk < n, so to apply it we should plug in 
some natural number smaller than n for k, but what should we plug in? The 
reference to division in the statement of the theorem provides a hint. If we 
think of division as repeated subtraction, then dividing n by m involves 
subtracting m from n repeatedly. The first step in this process would be to 
compute n — m, which is a natural number smaller than n. Perhaps we 
should plug in n — m for k. It’s not entirely clear where this will lead, but it’s 
worth a try. In fact, as you’ll see in the proof, once we take this step the 
desired conclusion follows almost immediately. 

Notice that we are using the fact that a quotient and remainder exist for 
some natural number smaller than n to prove that they exist for n, but this 
smaller number is not $n - 1$; it’s $n - m$. This is why we’re using strong
induction rather than ordinary induction for this proof. 


Proof. We let m be an arbitrary positive integer and then proceed by strong 
induction on n. 


Suppose n is a natural number, and for every $k < n$ there are natural
numbers q and r such that $k = qm + r$ and $r < m$.

Case 1. $n < m$. Let q = 0 and r = n. Then clearly $n = qm + r$ and $r < m$.

Case 2. $n \ge m$. Let $k = n - m < n$ and note that since $n \ge m$, k is a natural
number. By the inductive hypothesis we can choose $q'$ and $r'$ such that
$k = q'm + r'$ and $r' < m$. Then $n - m = q'm + r'$, so $n = q'm + r' + m = (q' + 1)m + r'$.
Thus, if we let $q = q' + 1$ and $r = r'$, then we have $n = qm + r$ and $r < m$,
as required.
□
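The two cases of this proof translate almost line for line into a recursive procedure. The following is a sketch under the assumption that n is a natural number and m is a positive integer; the function name `divide` is chosen here for illustration.

```python
def divide(n, m):
    """Return (q, r) with n == q*m + r and 0 <= r < m, for n >= 0 and m > 0."""
    if n < 0 or m <= 0:
        raise ValueError("requires n >= 0 and m > 0")
    if n < m:
        # Case 1 of the proof: q = 0 and r = n.
        return 0, n
    # Case 2 of the proof: apply the inductive hypothesis to k = n - m.
    q_prime, r_prime = divide(n - m, m)
    return q_prime + 1, r_prime

print(divide(17, 5))
```

The recursive call on $n - m$ corresponds to the use of the inductive hypothesis. Because the recursion depth grows with $n/m$, a practical implementation would use a loop or the built-in `divmod`; the recursive form is kept only to mirror the proof.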


The division algorithm can also be extended to negative integers n, and it 
can be shown that for every m and n the quotient and remainder q and r are 
unique. For more on this, see exercise 14. 

Our next example is another important theorem of number theory. We 
used this theorem in our proof in the introduction that there are infinitely 
many primes. We will have more to say about this theorem in Chapter 7. 


Theorem 6.4.2. Every integer n > 1 is either prime or a product of two or 
more primes. 


Scratch work 


We write the goal in the form
$\forall n \in \mathbb{N}[n > 1 \to (n \text{ is prime} \lor n \text{ is a product of primes})]$
and then use strong induction. Thus, our inductive hypothesis is
$\forall k < n[k > 1 \to (k \text{ is prime} \lor k \text{ is a product of primes})]$, and we must prove
that $n > 1 \to (n \text{ is prime} \lor n \text{ is a product of primes})$. Of course, we start by
assuming $n > 1$, and according to our strategies for proving disjunctions, a
good way to complete the proof would be to assume that n is not prime and
prove that it must be a product of primes. Because the assumption that n is
not prime means $\exists a \exists b(n = ab \land a < n \land b < n)$, we immediately use
existential instantiation to introduce the new variables a and b into the
proof. Applying the inductive hypothesis to a and b now leads to the
desired conclusion.


Proof. We use strong induction. Suppose n > 1, and suppose that for every 
integer k, if 1 < k < n then k is either prime or a product of primes. Of 
course, if n is prime then there is nothing to prove, so suppose n is not 
prime. Then we can choose positive integers a and b such that n = ab, a < n, 
and b < n. Note that since a < n = ab, it follows that b > 1, and similarly we 
must have a > 1. Thus, by the inductive hypothesis, each of a and b is either 
prime or a product of primes. But then since n = ab, n is a product of 
primes. 

□
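The structure of this proof also suggests a recursive way to produce a factorization. The sketch below assumes an integer $n > 1$; the name `prime_factors` is hypothetical. If no divisor a with $1 < a < n$ is found, n is prime; otherwise $n = ab$ and the function is applied to a and b, which are both smaller than n, mirroring the appeal to the inductive hypothesis.

```python
def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n > 1."""
    if n <= 1:
        raise ValueError("requires n > 1")
    a = 2
    while a * a <= n:
        if n % a == 0:
            b = n // a
            # n = a * b with 1 < a < n and 1 < b < n: factor both parts.
            return prime_factors(a) + prime_factors(b)
        a += 1
    # No divisor a with 1 < a and a*a <= n exists, so n is prime.
    return [n]

print(prime_factors(360))
```

Trial division is used here only because it is short; it is not an efficient factoring method for large n.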


The method of recursion studied in the last section also has a strong 
form. As an example of this, consider the following definition of a sequence 
of numbers, called the Fibonacci numbers. These numbers were first 
studied by the Italian mathematician Leonardo of Pisa (circa 1170-circa 
1250), who is better known by the nickname Fibonacci. 

$F_0 = 0$;
$F_1 = 1$;
for every $n \ge 2$, $F_n = F_{n-2} + F_{n-1}$.

For example, plugging in n = 2 in the last equation we find that
$F_2 = F_0 + F_1 = 0 + 1 = 1$. Similarly, $F_3 = F_1 + F_2 = 1 + 1 = 2$ and
$F_4 = F_2 + F_3 = 1 + 2 = 3$. Continuing in this way leads to the following values:

n      0  1  2  3  4  5  6  7   8   9   10
F_n    0  1  1  2  3  5  8  13  21  34  55


Note that, starting with $F_2$, each Fibonacci number is computed using,
not just the previous number in the sequence, but also the one before that.
This is the sense in which the recursion is strong. It shouldn’t be surprising,
therefore, that proofs involving the Fibonacci numbers often require strong
induction rather than ordinary induction.

To illustrate this we’ll prove the following remarkable formula for the
Fibonacci numbers:

\[ F_n = \frac{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n}{\sqrt{5}}. \]


It is hard at first to believe that this formula is right. After all, the Fibonacci 
numbers are integers, and it is not at all clear that this formula will give an 
integer value. And what do the Fibonacci numbers have to do with $\sqrt{5}$?
Nevertheless, a proof by strong induction shows that the formula is correct. 
(To see how this formula could be derived, see exercise 9.) 
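Before proving the formula, it may be reassuring to compare it numerically with the recursive definition. The sketch below is only a sanity check and makes some assumptions: the names `fib` and `binet` are chosen here, the formula is evaluated in floating-point arithmetic, and the comparison is limited to small n so that rounding error stays far below 1/2.

```python
from math import sqrt

def fib(n):
    """Compute F_n from the recursive definition F_0 = 0, F_1 = 1."""
    a, b = 0, 1            # (F_0, F_1)
    for _ in range(n):
        a, b = b, a + b    # advance (F_i, F_{i+1}) to (F_{i+1}, F_{i+2})
    return a

def binet(n):
    """Evaluate the closed formula for F_n in floating point."""
    s = sqrt(5)
    return (((1 + s) / 2) ** n - ((1 - s) / 2) ** n) / s

# Rounding the floating-point value recovers the integer F_n for small n.
matches = all(round(binet(n)) == fib(n) for n in range(30))
print(matches)
```

Agreement on finitely many values is evidence, not proof; it is the strong induction argument that establishes the formula for every n.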


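Theorem 6.4.3. If $F_n$ is the nth Fibonacci number, then

\[ F_n = \frac{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n}{\sqrt{5}}. \]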


Scratch work 


Because $F_0$ and $F_1$ are defined separately from $F_n$ for $n \ge 2$, we check the
formula for these cases separately. For $n \ge 2$, the definition of $F_n$ suggests
that we should use the assumption that the formula is correct for $F_{n-2}$ and
$F_{n-1}$ to prove that it is correct for $F_n$. Because we need to know that the
formula works for two previous cases, we must use strong induction rather
than ordinary induction. The rest of the proof is straightforward, although
the algebra gets a little messy.


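Proof. We use strong induction. Let n be an arbitrary natural number, and
suppose that for all $k < n$,

\[ F_k = \frac{\left(\frac{1+\sqrt{5}}{2}\right)^k - \left(\frac{1-\sqrt{5}}{2}\right)^k}{\sqrt{5}}. \]

Case 1. n = 0. Then

\[ \frac{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n}{\sqrt{5}} = \frac{\left(\frac{1+\sqrt{5}}{2}\right)^0 - \left(\frac{1-\sqrt{5}}{2}\right)^0}{\sqrt{5}} = \frac{1 - 1}{\sqrt{5}} = 0 = F_0. \]

Case 2. n = 1. Then

\[ \frac{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n}{\sqrt{5}} = \frac{\frac{1+\sqrt{5}}{2} - \frac{1-\sqrt{5}}{2}}{\sqrt{5}} = \frac{\sqrt{5}}{\sqrt{5}} = 1 = F_1. \]

Case 3. $n \ge 2$. Then applying the inductive hypothesis to $n - 2$ and $n - 1$,
we get

\begin{align*}
F_n = F_{n-2} + F_{n-1}
&= \frac{\left(\frac{1+\sqrt{5}}{2}\right)^{n-2} - \left(\frac{1-\sqrt{5}}{2}\right)^{n-2}}{\sqrt{5}} + \frac{\left(\frac{1+\sqrt{5}}{2}\right)^{n-1} - \left(\frac{1-\sqrt{5}}{2}\right)^{n-1}}{\sqrt{5}} \\
&= \frac{\left(\frac{1+\sqrt{5}}{2}\right)^{n-2}\left(1 + \frac{1+\sqrt{5}}{2}\right) - \left(\frac{1-\sqrt{5}}{2}\right)^{n-2}\left(1 + \frac{1-\sqrt{5}}{2}\right)}{\sqrt{5}}.
\end{align*}

Now note that

\[ \left(\frac{1+\sqrt{5}}{2}\right)^2 = \frac{1 + 2\sqrt{5} + 5}{4} = \frac{6 + 2\sqrt{5}}{4} = \frac{3 + \sqrt{5}}{2} = 1 + \frac{1+\sqrt{5}}{2}, \]

and similarly

\[ \left(\frac{1-\sqrt{5}}{2}\right)^2 = 1 + \frac{1-\sqrt{5}}{2}. \]

Substituting into the formula for $F_n$, we get

\[ F_n = \frac{\left(\frac{1+\sqrt{5}}{2}\right)^{n-2}\left(\frac{1+\sqrt{5}}{2}\right)^2 - \left(\frac{1-\sqrt{5}}{2}\right)^{n-2}\left(\frac{1-\sqrt{5}}{2}\right)^2}{\sqrt{5}} = \frac{\left(\frac{1+\sqrt{5}}{2}\right)^n - \left(\frac{1-\sqrt{5}}{2}\right)^n}{\sqrt{5}}. \]
□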

Notice that in the proof of Theorem 6.4.3 we had to treat the cases n = 0 
and n = 1 separately. The role that these cases play in the proof is similar to 
the role played by the base case in a proof by ordinary mathematical 
induction. Although we have said that proofs by strong induction don’t need
base cases, it is not uncommon to find some initial cases treated separately
in such proofs.


An important property of the natural numbers that is related to 
mathematical induction is the fact that every nonempty set of natural 
numbers has a smallest element. This is sometimes called the well-ordering 
principle, and we can prove it using strong induction. 


Theorem 6.4.4. (Well-ordering principle) Every nonempty set of natural 
numbers has a smallest element. 


Scratch work 


Our goal is $\forall S \subseteq \mathbb{N}(S \ne \varnothing \to S \text{ has a smallest element})$. After letting S be
an arbitrary subset of $\mathbb{N}$, we’ll prove the contrapositive of the conditional
statement. In other words, we will assume that S has no smallest element
and prove that $S = \varnothing$. The way induction comes into it is that, for a set
$S \subseteq \mathbb{N}$, to say that $S = \varnothing$ is the same as saying that $\forall n \in \mathbb{N}(n \notin S)$. We’ll prove
this last statement by strong induction.


Proof. Suppose $S \subseteq \mathbb{N}$, and S does not have a smallest element. We will
prove that $\forall n \in \mathbb{N}(n \notin S)$, so $S = \varnothing$. Thus, if $S \ne \varnothing$ then S must have a
smallest element.

To prove that $\forall n \in \mathbb{N}(n \notin S)$, we use strong induction. Suppose that
$n \in \mathbb{N}$ and $\forall k < n(k \notin S)$. Clearly if $n \in S$ then n would be the smallest
element of S, and this would contradict the assumption that S has no
smallest element. Therefore $n \notin S$.
□


Sometimes, proofs that could be done by induction are written instead as 
applications of the well-ordering principle. As an example of the use of the 
well-ordering principle in a proof, we present a proof that $\sqrt{2}$ is irrational.
See exercise 2 for an alternative approach to this proof using strong 
induction. 


Theorem 6.4.5. $\sqrt{2}$ is irrational.


Scratch work 


Because irrational means “not rational,” our goal is a negative statement, so
proof by contradiction is a logical method to use. Thus, we assume $\sqrt{2}$ is
rational and try to reach a contradiction. The assumption that $\sqrt{2}$ is rational
means that there exist integers p and q such that $p/q = \sqrt{2}$, and since $\sqrt{2}$ is
positive, we may as well restrict our attention to positive p and q. Because
this is an existential statement, our next step should probably be to let p and
q stand for positive integers such that $p/q = \sqrt{2}$. As you will see in the
proof, simple algebraic manipulations with the equation $p/q = \sqrt{2}$ do not
lead to any obvious contradictions, but they do lead to the conclusion that p
and q must both be even. Thus, in the fraction p/q we can cancel a 2 from
both numerator and denominator, getting a new fraction with smaller
numerator and denominator that is equal to $\sqrt{2}$.

How can we derive a contradiction from this conclusion? The key idea is
to note that our reasoning would apply to any fraction that is equal to $\sqrt{2}$.
Thus, in any such fraction we can cancel a factor of 2 from numerator and
denominator, and therefore there can be no smallest possible numerator or
denominator for such a fraction. But this would violate the well-ordering
principle! Thus, we have our contradiction.

This idea is spelled out more carefully in the following proof, in which
we’ve applied the well-ordering principle to the set of all possible
denominators of fractions equal to $\sqrt{2}$. We have chosen to put this
application of the well-ordering principle at the beginning of the proof,
because this seems to give the shortest and most direct proof. Readers of the
proof might be puzzled at first about why we’re using the well-ordering
principle (unless they’ve read this scratch work!), but after the algebraic
manipulations with the equation $p/q = \sqrt{2}$ are completed, the contradiction
appears almost immediately. This is a good example of how a clever,
carefully planned step early in a proof can lead to a wonderful punch line at
the end of the proof.


Proof. Suppose that $\sqrt{2}$ is rational. This means that
$\exists q \in \mathbb{Z}^+ \exists p \in \mathbb{Z}^+ (p/q = \sqrt{2})$, so the set
$S = \{q \in \mathbb{Z}^+ \mid \exists p \in \mathbb{Z}^+ (p/q = \sqrt{2})\}$ is nonempty. By the
well-ordering principle we can let q be the smallest element of S. Since
$q \in S$, we can choose some $p \in \mathbb{Z}^+$ such that $p/q = \sqrt{2}$. Therefore $p^2/q^2 = 2$, so
$p^2 = 2q^2$ and therefore $p^2$ is even. We now apply the theorem from Example
3.4.3, which says that for any integer x, x is even iff $x^2$ is even. Since $p^2$ is
even, p must be even, so we can choose some $\bar{p} \in \mathbb{Z}^+$ such that $p = 2\bar{p}$.
Therefore $p^2 = 4\bar{p}^2$, and substituting this into the equation $p^2 = 2q^2$ we get
$4\bar{p}^2 = 2q^2$, so $2\bar{p}^2 = q^2$ and therefore $q^2$ is even. Appealing to Example
3.4.3 again, this means q must be even, so we can choose some $\bar{q} \in \mathbb{Z}^+$ such
that $q = 2\bar{q}$. But then $\sqrt{2} = p/q = (2\bar{p})/(2\bar{q}) = \bar{p}/\bar{q}$, so $\bar{q} \in S$. Clearly
$\bar{q} < q$, so this contradicts the fact that q was chosen to be the smallest
element of S. Therefore $\sqrt{2}$ is irrational.
□
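The set S in this proof can be explored by computer for small denominators. The following sketch assumes a search bound chosen arbitrarily, and the name `denominators_up_to` is hypothetical. Since $p/q = \sqrt{2}$ for positive integers p and q is equivalent to $p^2 = 2q^2$, the search uses only exact integer arithmetic.

```python
from math import isqrt

def denominators_up_to(limit):
    """Return every q in 1..limit for which some positive integer p has p*p == 2*q*q."""
    found = []
    for q in range(1, limit + 1):
        target = 2 * q * q
        p = isqrt(target)          # integer part of the square root of 2q^2
        if p * p == target:        # true only if 2q^2 is a perfect square
            found.append(q)
    return found

print(denominators_up_to(10_000))
```

An empty result for a finite range is consistent with the theorem but does not prove it; the well-ordering argument is what rules out every denominator at once.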


Exercises

*1. This exercise gives an alternative way to justify the method of strong
induction. All variables in this exercise range over $\mathbb{N}$. Suppose P(n) is a
statement about a natural number n, and suppose that, following the
strong induction strategy, we have proven that
$\forall n[(\forall k < n\, P(k)) \to P(n)]$. Let Q(n) be the statement $\forall k < n\, P(k)$.
(a) Prove $\forall n Q(n) \leftrightarrow \forall n P(n)$ without using induction.
(b) Prove $\forall n Q(n)$ by ordinary induction. Thus, by part (a), $\forall n P(n)$ is true.

2. Rewrite the proof of Theorem 6.4.5 as a proof by strong induction that
$\forall q \in \mathbb{N}[q > 0 \to \lnot \exists p \in \mathbb{Z}^+ (p/q = \sqrt{2})]$.

3. In this exercise you will give another proof that $\sqrt{2}$ is irrational.
Suppose $\sqrt{2}$ is rational. As in the proof of Theorem 6.4.5, let
$S = \{q \in \mathbb{Z}^+ \mid \exists p \in \mathbb{Z}^+ (p/q = \sqrt{2})\} \ne \varnothing$, let q be the smallest element
of S, and let p be a positive integer such that $p/q = \sqrt{2}$. Now get a
contradiction by showing that $p - q \in S$ and $p - q < q$.

*4. (a) Prove that $\sqrt{6}$ is irrational.
(b) Prove that $\sqrt{2} + \sqrt{3}$ is irrational.

5. The Martian monetary system uses colored beads instead of coins. A
blue bead is worth 3 Martian credits, and a red bead is worth 7 Martian
credits. Thus, three blue beads are worth 9 credits, and a blue and red
bead together are worth 10 credits, but no combination of blue and red
beads is worth 11 credits. Prove that for all $n \ge 12$, there is some
combination of blue and red beads that is worth n credits.

6. Suppose that x is a real number, $x \ne 0$, and $x + 1/x$ is an integer. Prove
that for all $n \ge 1$, $x^n + 1/x^n$ is an integer.

7. Let $F_n$ be the nth Fibonacci number. All variables in this exercise range
over $\mathbb{N}$.
(a) Prove that for all n, $\sum_{i=0}^{n} F_i = F_{n+2} - 1$.
(b) Prove that for all n, $\sum_{i=0}^{n} (F_i)^2 = F_n F_{n+1}$.
(c) Prove that for all n, $\sum_{i=0}^{n} F_{2i+1} = F_{2n+2}$.
(d) Find a formula for $\sum_{i=0}^{n} F_{2i}$ and prove that your formula is correct.

8. Let $F_n$ be the nth Fibonacci number. All variables in this exercise range
over $\mathbb{N}$.
(a) Prove that for all $m \ge 1$ and all n, $F_{m+n} = F_{m-1} F_n + F_m F_{n+1}$.
(b) Prove that for all $m \ge 1$ and all $n \ge 1$, $F_{m+n} = F_{m+1} F_{n+1} - F_{m-1} F_{n-1}$.
(c) Prove that for all n, $(F_n)^2 + (F_{n+1})^2 = F_{2n+1}$ and $(F_{n+2})^2 - (F_n)^2 = F_{2n+2}$.
(d) Prove that for all m and n, if $m \mid n$ then $F_m \mid F_n$.
(e) See exercise 18 in Section 6.3 for the meaning of the notation used in
this exercise. Prove that for all $n \ge 1$,
\[ F_{2n-1} = \binom{2n-2}{0} + \binom{2n-3}{1} + \binom{2n-4}{2} + \cdots + \binom{n-1}{n-1} = \sum_{i=0}^{n-1} \binom{2n-i-2}{i} \]
and
\[ F_{2n} = \binom{2n-1}{0} + \binom{2n-2}{1} + \binom{2n-3}{2} + \cdots + \binom{n}{n-1} = \sum_{i=0}^{n-1} \binom{2n-i-1}{i}. \]

9. A sequence of numbers $a_0, a_1, a_2, \ldots$ is called a generalized Fibonacci
sequence, or a Gibonacci sequence for short, if for every $n \ge 2$,
$a_n = a_{n-2} + a_{n-1}$. Thus, a Gibonacci sequence satisfies the same recurrence
relation as the Fibonacci numbers, but it may start out differently.
(a) Suppose c is a real number and $\forall n \in \mathbb{N}(a_n = c^n)$. Prove that $a_0, a_1, a_2, \ldots$
is a Gibonacci sequence iff either $c = (1 + \sqrt{5})/2$ or $c = (1 - \sqrt{5})/2$.
(b) Suppose s and t are real numbers, and for all $n \in \mathbb{N}$,
\[ a_n = s\left(\frac{1+\sqrt{5}}{2}\right)^n + t\left(\frac{1-\sqrt{5}}{2}\right)^n. \]
Prove that $a_0, a_1, a_2, \ldots$ is a Gibonacci sequence.
(c) Suppose $a_0, a_1, a_2, \ldots$ is a Gibonacci sequence. Prove that there are
real numbers s and t such that for all $n \in \mathbb{N}$,
\[ a_n = s\left(\frac{1+\sqrt{5}}{2}\right)^n + t\left(\frac{1-\sqrt{5}}{2}\right)^n. \]
(Hint: First show that there are real numbers s and t such that the
formula above is correct for $a_0$ and $a_1$. Then show that with this choice
of s and t, the formula is correct for all n.)

10. The Lucas numbers (named for the French mathematician Edouard
Lucas (1842-1891)) are the numbers $L_0, L_1, L_2, \ldots$ defined as follows:

$L_0 = 2$;
$L_1 = 1$;
for every $n \ge 2$, $L_n = L_{n-2} + L_{n-1}$.

Find a formula for $L_n$ and prove that your formula is correct. (Hint:
Apply exercise 9.)

11. A sequence $a_0, a_1, a_2, \ldots$ is defined recursively as follows:

$a_0 = 1$;
$a_1 = 0$;
for every $n \ge 2$, $a_n = 5a_{n-1} - 6a_{n-2}$.

Find a formula for $a_n$ and prove that your formula is correct. (Hint:
Imitate exercise 9.)

12. A sequence $a_0, a_1, a_2, \ldots$ is defined recursively as follows:

$a_0 = 0$;
$a_1 = a_2 = 1$;
for every $n \ge 3$, $a_n = \frac{1}{2} a_{n-3} + \frac{3}{2} a_{n-2} + \frac{1}{2} a_{n-1}$.

Prove that for all $n \in \mathbb{N}$, $a_n = F_n$, the nth Fibonacci number.

13. For each positive integer n, let $A_n = \{1, 2, \ldots, n\}$, and let
$P_n = \{X \in \mathcal{P}(A_n) \mid X \text{ does not contain two consecutive integers}\}$. For example,
$P_3 = \{\varnothing, \{1\}, \{2\}, \{3\}, \{1, 3\}\}$; $P_3$ does not contain the sets $\{1, 2\}$, $\{2, 3\}$,
and $\{1, 2, 3\}$ because each contains at least one pair of consecutive
integers. Prove that for every n, the number of elements in $P_n$ is $F_{n+2}$,
the (n + 2)th Fibonacci number. (For example, the number of elements
in $P_3$ is $5 = F_5$. Hint: Which elements of $P_n$ contain n? Which don’t?
The answers to both questions are related to the elements of $P_m$ for
certain $m < n$.)

14. Suppose n and m are integers and $m > 0$.
(a) Prove that there are integers q and r such that $n = qm + r$ and $0 \le r < m$.
(Hint: If $n \ge 0$, then this follows from Theorem 6.4.1. If $n < 0$, then
start by applying Theorem 6.4.1 to $-n$ and m. Another possibility is to
apply Theorem 6.4.1 to $-n - 1$ and m.)
(b) Prove that the integers q and r in part (a) are unique. In other words,
show that if $q'$ and $r'$ are integers such that $n = q'm + r'$ and $0 \le r' < m$,
then $q = q'$ and $r = r'$.
(c) Prove that for every integer n, exactly one of the following statements
is true: $n \equiv 0 \pmod 3$, $n \equiv 1 \pmod 3$, $n \equiv 2 \pmod 3$. (Recall that this
notation was introduced in Definition 4.5.9.)

15. Suppose k is a positive integer. Prove that there is some positive integer
a such that for all $n \ge a$, $2^n \ge n^k$. (In the language of exercise 19 of
Section 5.1, this implies that if $f(n) = n^k$ and $g(n) = 2^n$ then $f \in O(g)$.
Hint: By the division algorithm, for any natural number n there are
natural numbers q and r such that $n = qk + r$ and $0 \le r < k$. Therefore
$2^n \ge 2^{qk} = (2^q)^k$. To choose a, figure out how large q has to be to guarantee
that $2^q \ge n$. You may find Example 6.1.3 useful.)

16. (a) Suppose k is a positive integer, $a_1, a_2, \ldots, a_k$ are real numbers,
and $f_1, f_2, \ldots, f_k$ and g are all functions from $\mathbb{Z}^+$ to $\mathbb{R}$. Also,
suppose that $f_1, f_2, \ldots, f_k$ are all elements of O(g). (See exercise
19 of Section 5.1 for the meaning of the notation used here.)
Define $f: \mathbb{Z}^+ \to \mathbb{R}$ by the formula
$f(n) = a_1 f_1(n) + a_2 f_2(n) + \cdots + a_k f_k(n)$. Prove that $f \in O(g)$.
(Hint: Use induction on k, and exercise 19(c) of Section 5.1.)
(b) Let $g: \mathbb{Z}^+ \to \mathbb{R}$ be defined by the formula $g(n) = 2^n$. Suppose
$a_0, a_1, a_2, \ldots, a_k$ are real numbers, and define $f: \mathbb{Z}^+ \to \mathbb{R}$ by the formula
$f(n) = a_0 + a_1 n + a_2 n^2 + \cdots + a_k n^k$. (Such a function is called a polynomial.)
Prove that $f \in O(g)$. (Hint: Use exercise 15 and part (a).)

17. A sequence $a_0, a_1, a_2, \ldots$ is defined recursively as follows:

$a_0 = 1$;
for every $n \in \mathbb{N}$, $a_{n+1} = 1 + \sum_{i=0}^{n} a_i$.

Find a formula for $a_n$ and prove that your formula is correct.

18. A sequence $a_0, a_1, a_2, \ldots$ is defined recursively as follows:

$a_0 = 1$;
for every $n \in \mathbb{N}$, $a_{n+1} = 1 + \frac{1}{a_n}$.

Find a formula for $a_n$ and prove that your formula is correct. (Hint:
These numbers are related to the Fibonacci numbers.)

19. In this problem, you will prove that there are no positive integers a, b,
c, and d such that

\[ a^2 + 2b^2 = c^2 \quad \text{and} \quad 2a^2 + b^2 = d^2. \tag{$*$} \]

(a) Prove that for all integers m and n, if $3 \mid (m^2 + n^2)$ then $3 \mid m$ and $3 \mid n$.
(Hint: By exercise 14(c), either $m \equiv 0 \pmod 3$ or $m \equiv 1 \pmod 3$ or
$m \equiv 2 \pmod 3$, and also either $n \equiv 0 \pmod 3$ or $n \equiv 1 \pmod 3$ or
$n \equiv 2 \pmod 3$. This gives nine possibilities. Determine which of these
possibilities are compatible with the assumption that $3 \mid (m^2 + n^2)$.)

Now suppose there are positive integers satisfying (*). Let
$S = \{d \in \mathbb{Z}^+ \mid \exists a \in \mathbb{Z}^+ \exists b \in \mathbb{Z}^+ \exists c \in \mathbb{Z}^+ (a^2 + 2b^2 = c^2 \land 2a^2 + b^2 = d^2)\}$.
Then $S \ne \varnothing$, so by the well-ordering principle we can let d be the
smallest element of S. Let a, b, and c be positive integers satisfying
(*).

(b) Prove that $3 \mid c$ and $3 \mid d$. (Hint: Add the two equations in (*) and then
apply part (a).)
(c) Prove that $3 \mid a$ and $3 \mid b$. (Hint: Add the two equations in (*) and then
apply part (b).)
(d) Show that there is an element of S that is smaller than d, which
contradicts our choice of d. (Hint: Combine parts (b) and (c).)

20. The number $(1 + \sqrt{5})/2$ that appears in the formula for the Fibonacci
numbers in Theorem 6.4.3 is called the golden ratio. It is usually
denoted $\varphi$, and it comes up in numerous contexts in mathematics, art,
and the natural world. In this exercise you will investigate a few of the
mathematical contexts in which $\varphi$ arises.
(a) In Figure 6.14, AEFD is a square. Show that if the ratio of the length of
the longer side of rectangle BCFE to its shorter side is the same as the
ratio of the length of the longer side of rectangle ABCD to its shorter
side, then that ratio is $\varphi$.
(b) Show that $\cos(36^\circ) = \varphi/2$. (Hint: Let $x = \cos(36^\circ)$. First show that
$\cos(108^\circ) = -\cos(72^\circ)$. Then use trigonometric identities to express
$\cos(108^\circ)$ and $\cos(72^\circ)$ in terms of x. Substitute into the equation
$\cos(108^\circ) = -\cos(72^\circ)$ to get an equation involving x and then solve the
equation.)
(c) In Figure 6.15, ABCDE is a regular pentagon with side length 1. Show
that the length of the diagonal AC is $\varphi$. (Hint: First find the angles in
triangle ABC; you may find Example 6.2.3 helpful for this. Then use
part (b).)

[Figure 6.14. Rectangle ABCD divided into square AEFD and rectangle BCFE.]

[Figure 6.15. Regular pentagon ABCDE with side length 1.]

21. The commutative law for multiplication says that for any numbers a
and b, ab = ba. The associative law says that for any numbers a, b, and
c, (ab)c = a(bc). In this problem you will show that, although these
laws are stated for products of two or three numbers, they can be used
to justify reordering and regrouping the terms in a product of any list of
numbers in any way.
(a) Use the commutative and associative laws to show that for any
numbers a, b, c, and d, (ab)(cd) = c((ad)b).
(b) Let us say that the left-grouped product of a list of numbers
$a_1, a_2, \ldots, a_n$ is the product in which the terms are grouped as follows:

\[ (\cdots(((a_1 a_2)a_3)a_4) \cdots a_{n-1})a_n. \]

More precisely, we can define the left-grouped product recursively as
follows: For a list consisting of a single number $a_1$, the left-grouped
product is $a_1$. If the left-grouped product of $a_1, a_2, \ldots, a_n$ is p, then the
left-grouped product of $a_1, a_2, \ldots, a_n, a_{n+1}$ is $p\,a_{n+1}$. Use the
associative law to show that any product of a list of numbers
$a_1, a_2, \ldots, a_n$ (with the terms in that order, but with parentheses inserted
to group the terms in any way) is equal to the left-grouped product.
(c) Use the commutative and associative laws to show that any two
products of the numbers $a_1, a_2, \ldots, a_n$, with the terms in any order and
grouped in any way, are equal.


6.5. Closures Again 


In Section 5.4 we promised to use mathematical induction to give an 
alternative treatment of closures of sets under functions. In this section we 
fulfill this promise. 

Recall that if $f: A \to A$ and $B \subseteq A$, then the closure of B under f is the
smallest set $C \subseteq A$ such that $B \subseteq C$ and C is closed under f. In this section
we'll find this set C by starting with B and then adding only those elements
of A that must be added if we want to end up with a set that is closed under
f. We begin with a sketchy description of how we’ll do this, motivated by
the examples in Section 5.4. Then we’ll use recursion and induction to
make this sketchy idea precise and prove that it works.

As we saw in the examples in Section 5.4, if we want to find a set $C \subseteq A$
such that $B \subseteq C$ and C is closed under f, then for every $x \in B$, we must
have $f(x) \in C$. In other words, $\{f(x) \mid x \in B\} \subseteq C$. Recall from Section 5.5
that $\{f(x) \mid x \in B\}$ is called the image of B under f, and is denoted f(B). So
we will need to have $f(B) \subseteq C$. But then similar reasoning implies that the
image of f(B) under f must also be a subset of C; in other words,
$f(f(B)) \subseteq C$.

Continuing in this way leads to a sequence of sets that must be contained
in C: B, f(B), f(f(B)), and so on. We will prove that putting these sets
together by taking their union will give us the closure of B under f. In other
words, if we let B_0 = B, B_1 = f(B), B_2 = f(f(B)), ..., then the closure of B
under f is B_0 ∪ B_1 ∪ B_2 ∪ ···. The use of ellipses in our description of this
process suggests that to make it precise, we should use induction and
recursion. This is what we do in the statement and proof of our next
theorem.


Theorem 6.5.1. Suppose f : A → A and B ⊆ A. Let the sets B_0, B_1, B_2, ...
be defined recursively as follows:


B_0 = B;
for all n ∈ N, B_{n+1} = f(B_n).


Then the closure of B under f is the set ⋃_{n∈N} B_n.


Proof. Let C = ⋃_{n∈N} B_n. Since f : A → A, it is not hard to see that each
set B_n is a subset of A, and therefore C ⊆ A. According to the definition of
closure, we must check that B ⊆ C, C is closed under f, and for every set
D ⊆ A, if B ⊆ D and D is closed under f then C ⊆ D.

The first of these holds because B = B_0 ⊆ ⋃_{n∈N} B_n = C. For the second,
suppose that x ∈ C. Then by the definition of C, we can choose some m ∈ N
such that x ∈ B_m. But then f(x) ∈ f(B_m) = B_{m+1}, so f(x) ∈ ⋃_{n∈N} B_n = C.
Since x was an arbitrary element of C, this shows that C is closed under f.

Finally, suppose that B ⊆ D ⊆ A and D is closed under f. We must show
that C ⊆ D, and by the definition of C it suffices to show that
∀n ∈ N(B_n ⊆ D). We prove this by induction on n.

The base case holds because we have B_0 = B ⊆ D by assumption. For the
induction step, suppose that n ∈ N and B_n ⊆ D. Now suppose x ∈ B_{n+1}. By
the definition of B_{n+1}, this means x ∈ f(B_n), so there is some b ∈ B_n such
that x = f(b). But by the inductive hypothesis, B_n ⊆ D, so b ∈ D, and since
D is closed under f it follows that x = f(b) ∈ D. Since x was an arbitrary
element of B_{n+1}, this shows that B_{n+1} ⊆ D. □


Commentary. Because the proof must refer to the set ⋃_{n∈N} B_n often, it is
convenient to give this set the name C right at the beginning of the proof.
The proof claims that it is not hard to see that for every n ∈ N, B_n ⊆ A, and
therefore C ⊆ A. As usual, if you don’t see why this is true you should
work out the details of the proof yourself. (You might try proving
∀n ∈ N(B_n ⊆ A) by mathematical induction.) The definition of closure then
tells us that we must prove three statements: B ⊆ C, C is closed under f, and
for all D ⊆ A, if B ⊆ D and D is closed under f then C ⊆ D. Of course, we
prove them one at a time.

The proof of the first of these statements, B ⊆ C, is also not worked out
in detail. If you have trouble following it, see exercise 8 in Section 3.3. The
second statement we must prove says that C is closed under f, and the proof
is based on the definition of closed: we let x be arbitrary, assume x ∈ C, and
prove that f(x) ∈ C. According to the definition of C, the statement x ∈ C
means ∃n ∈ N(x ∈ B_n), so we immediately introduce the variable m to
stand for a natural number such that x ∈ B_m. The goal f(x) ∈ C is also an
existential statement, so to prove it we must find a natural number k such
that f(x) ∈ B_k. The proof shows that k = m + 1 works.

Finally, to prove the third statement we use the natural strategy of letting
D be an arbitrary set, assuming B ⊆ D ⊆ A and D is closed under f, and
then proving that C ⊆ D. Once again, if you don’t see why the conclusion
C ⊆ D follows from ∀n ∈ N(B_n ⊆ D), as claimed in the proof, you should
work out the details of the proof yourself. This last statement is proven by
induction, as you might expect based on the recursive nature of the
definition of B_n. For the induction step, we let n be an arbitrary natural
number, assume that B_n ⊆ D, and prove that B_{n+1} ⊆ D. To prove that
B_{n+1} ⊆ D we take an arbitrary element of B_{n+1} and prove that it must be an
element of D. Writing out the recursive definition of B_{n+1} gives us a way to
use the inductive hypothesis, which, as usual, is the key to completing the
induction step.
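
When B is finite and the closure turns out to be finite, the recursive definition in Theorem 6.5.1 can be carried out mechanically. The following Python sketch is only an illustration and not part of the proof; the name closure_under, the step limit, and the sample function x ↦ 2x mod 7 are assumptions made for this example. It computes B_0, B_1, B_2, ... in turn and stops as soon as B_{n+1} contributes nothing new to B_0 ∪ ··· ∪ B_n, since after that point no later set in the sequence can contribute anything new either.

```python
def closure_under(f, B, max_steps=10_000):
    """Return the union of B_0, B_1, ..., where B_0 = B and B_{n+1} = f(B_n).

    Only suitable when the closure is finite; max_steps is an arbitrary
    safety limit chosen for this sketch.
    """
    current = set(B)   # the set B_n, starting with B_0 = B
    union = set(B)     # B_0 ∪ B_1 ∪ ... ∪ B_n
    for _ in range(max_steps):
        current = {f(x) for x in current}   # B_{n+1} = f(B_n)
        if current <= union:                # B_{n+1} adds nothing new
            return union
        union |= current
    raise RuntimeError("closure did not stabilize within max_steps")


def double_mod_7(x):
    """Hypothetical sample function from {0, ..., 6} to itself."""
    return (2 * x) % 7


# Here B_0 = {1}, B_1 = {2}, B_2 = {4}, B_3 = {1}, so the closure is {1, 2, 4}.
print(closure_under(double_mod_7, {1}))
```

For an infinite closure, such as the closure of {0} under x ↦ x + 1, the loop would never stabilize, which is why the sketch raises an error after the step limit rather than running forever.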


We end this chapter by returning once again to one of the proofs in the 
introduction. Recall that in our first proof in the introduction we used the 
formula 


\[ (2^b - 1) \cdot (1 + 2^b + 2^{2b} + \cdots + 2^{(a-1)b}) = 2^{ab} - 1. \]


We discussed this proof again in Section 3.7 and promised to give a more 
careful proof of this formula after we had discussed mathematical 


induction. We are ready now to give this more careful proof. Of course, we 
can also state the formula more precisely now, using summation notation. 
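Theorem 6.5.2. For all positive integers a and b,

\[ (2^b - 1) \cdot \sum_{k=0}^{a-1} 2^{kb} = 2^{ab} - 1. \]

Proof. We let b be an arbitrary positive integer and then proceed by
induction on a.

Base case: When a = 1 we have

\[
\begin{aligned}
(2^b - 1) \cdot \sum_{k=0}^{a-1} 2^{kb} &= (2^b - 1) \cdot \sum_{k=0}^{0} 2^{kb} \\
&= (2^b - 1) \cdot 1 \\
&= 2^{ab} - 1.
\end{aligned}
\]

Induction step: Suppose a ≥ 1 and
$(2^b - 1) \cdot \sum_{k=0}^{a-1} 2^{kb} = 2^{ab} - 1$. Then

\[
\begin{aligned}
(2^b - 1) \cdot \sum_{k=0}^{a} 2^{kb} &= (2^b - 1) \cdot \Bigl( \sum_{k=0}^{a-1} 2^{kb} + 2^{ab} \Bigr) \\
&= (2^b - 1) \cdot \sum_{k=0}^{a-1} 2^{kb} + 2^b \cdot 2^{ab} - 2^{ab} \\
&= 2^{ab} - 1 + 2^{(a+1)b} - 2^{ab} \qquad \text{(inductive hypothesis)} \\
&= 2^{(a+1)b} - 1.
\end{aligned}
\]
□
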
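
A numerical check is no substitute for the induction proof, but it can catch a misread exponent or index. The short Python sketch below evaluates both sides of the formula in Theorem 6.5.2 for small a and b; the function names lhs and rhs and the ranges tested are choices made for this illustration.

```python
def lhs(a, b):
    """Left side: (2^b - 1) times the sum of 2^(k*b) for k = 0, ..., a - 1."""
    return (2**b - 1) * sum(2**(k * b) for k in range(a))


def rhs(a, b):
    """Right side: 2^(a*b) - 1."""
    return 2**(a * b) - 1


# Compare the two sides for all 1 <= a, b < 20.
assert all(lhs(a, b) == rhs(a, b) for a in range(1, 20) for b in range(1, 20))
print(lhs(3, 2), rhs(3, 2))  # 3 * (1 + 4 + 16) and 2^6 - 1
```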
Exercises 


*1. Let f : R → R be defined by the formula f(x) = x + 1, and let B = {0}. We
saw in part 2 of Example 5.4.4 that the closure of B under f is N. What
are the sets B_0, B_1, B_2, ... defined in Theorem 6.5.1?


2. Let f : R → R be defined by the formula f(x) = x − 1, and let B = N. We
saw after Example 5.4.2 that the closure of B under f is Z. What are the
sets B_0, B_1, B_2, ... defined in Theorem 6.5.1?

3. Suppose F is a set of functions from A to A and B ⊆ A. In exercise 12
of Section 5.4 we defined the closure of B under F to be the smallest set
C ⊆ A such that B ⊆ C and for every f ∈ F, C is closed under f. Let the
sets B_0, B_1, B_2, ... be defined recursively as follows:

B_0 = B;
for all n ∈ N, B_{n+1} = ⋃_{f∈F} f(B_n).

Prove that ⋃_{n∈N} B_n is the closure of B under F.

*4. For each natural number n, let f_n : P(N) → P(N) be defined by the
formula f_n(X) = X ∪ {n}, and let F = {f_n | n ∈ N}. Let B = {∅}. In part
(b) of exercise 12 in Section 5.4 you showed that the closure of B under
F is the set of all finite subsets of N. What are the sets B_0, B_1, B_2, ...
defined in exercise 3?

5. Let f : N × N → N be defined by the formula f(x, y) = xy. Let P be the
set of all prime numbers. What is the closure of P under f?


6. Consider the following incorrect theorem:

Incorrect Theorem. Suppose f : A × A → A and B ⊆ A. Let the sets B_0,
B_1, B_2, ... be defined recursively as follows:

B_0 = B;
for all n ∈ N, B_{n+1} = f(B_n × B_n).

Then the closure of B under f is the set ⋃_{n∈N} B_n.

What’s wrong with the following proof of the theorem?

Proof. Let C = ⋃_{n∈N} B_n. It is not hard to see that each set B_n is a subset
of A, so C ⊆ A, and B = B_0 ⊆ C.

To see that C is closed under f, suppose x, y ∈ C. Then by the
definition of C, there is some m ∈ N such that x, y ∈ B_m. Therefore
f(x, y) ∈ f(B_m × B_m) = B_{m+1}, so f(x, y) ∈ ⋃_{n∈N} B_n = C.

Finally, suppose B ⊆ D ⊆ A and D is closed under f. To prove that
C ⊆ D, it will suffice to prove that ∀n ∈ N(B_n ⊆ D). We prove this by
induction. The base case holds because B_0 = B ⊆ D by assumption. For
the induction step, suppose B_n ⊆ D and let x ∈ B_{n+1} be arbitrary. By the
definition of B_{n+1}, this means that x = f(a, b) for some a, b ∈ B_n. By the
inductive hypothesis, B_n ⊆ D, so a, b ∈ D, and since D is closed under
f, it follows that x = f(a, b) ∈ D. Therefore B_{n+1} ⊆ D. □

7. Let f : R × R → R be defined by the formula f(x, y) = xy, and let
B = {x ∈ R | −2 < x < 0}. In this problem you will show that f and B are a
counterexample to the incorrect theorem in exercise 6.
(a) What are the sets B_0, B_1, B_2, ... defined in the incorrect theorem?
(b) Show that ⋃_{n∈N} B_n is not the closure of B under f. Which of the three
properties in the definition of closure (Definition 5.4.8) does not hold?
(c) What is the closure of B under f?

8. Suppose f : A × A → A and B ⊆ A. Let the sets B_0, B_1, B_2, ... be defined
recursively as follows:

B_0 = B;
for all n ∈ N, B_{n+1} = B_n ∪ f(B_n × B_n).

(a) Prove that for all natural numbers m and n, if m < n then B_m ⊆ B_n.
(Hint: Let m be arbitrary and then use induction on n.)
(b) Prove that ⋃_{n∈N} B_n is the closure of B under f.

9. Suppose f : A → A and f is a constant function; in other words, there is
some c ∈ A such that for all x ∈ A, f(x) = c. Suppose B ⊆ A. What are
the sets B_0, B_1, B_2, ... defined in Theorem 6.5.1? What is the closure of
B under f?

10. There is another proof in the introduction that could be written more
rigorously using induction. Recall that in the proof of Theorem 4 in the
introduction we used the fact that if n is a positive integer, x = (n + 1)! + 2,
and 0 ≤ i ≤ n − 1, then (i + 2) | (x + i). Use induction to prove this. (We
used this fact to show that x + i is not prime.)


The remaining exercises in this section will use the following definition. 
Suppose R ⊆ A × A. Let R^1, R^2, R^3, ... be defined recursively as follows:

R^1 = R;
for all n ∈ Z⁺, R^{n+1} = R^n ∘ R.

Clearly for every positive integer n, R^n is a relation on A.


11. Suppose R ⊆ A × A. Prove that for all positive integers m and n,
R^{m+n} = R^m ∘ R^n.

12. Suppose f : A → A.
(a) Prove that for every positive integer n, f^n : A → A.
(b) Suppose B ⊆ A, and let the sets B_0, B_1, B_2, ... be defined as in
Theorem 6.5.1. Prove that for every positive integer n, f^n(B) = B_n.

13. Suppose f : A → A and a ∈ A. We say that a is a periodic point for f if
there is some positive integer n such that f^n(a) = a.
(a) Show that if a is a periodic point for f then the closure of {a} under f is
a finite set.
(b) Suppose the closure of {a} under f is a finite set. Must a be a periodic
point for f?

14. Suppose R ⊆ A × A and let T = ⋃_{n∈Z⁺} R^n. Prove that T is the
transitive closure of R. (See exercise 25 of Section 4.4 for the definition
of transitive closure.)

15. Suppose R and S are relations on A and R ⊆ S. Prove that for every
positive integer n, R^n ⊆ S^n.

16. Suppose R and S are relations on A and n is a positive integer.
(a) What is the relationship between R^n ∩ S^n and (R ∩ S)^n? Justify your
conclusions with proofs or counterexamples.
(b) What is the relationship between R^n ∪ S^n and (R ∪ S)^n? Justify your
conclusions with proofs or counterexamples.


17. Suppose R is a relation on A and T is the transitive closure of R. If
(a, b) ∈ T, then by exercise 14 there is some positive integer n such that
(a, b) ∈ R^n, and therefore by the well-ordering principle (Theorem 6.4.4),
there must be a smallest such n. We define the distance from a to b to
be the smallest positive integer n such that (a, b) ∈ R^n, and we write
d(a, b) to denote this distance.
(a) Suppose that (a, b) ∈ T and (b, c) ∈ T (and therefore (a, c) ∈ T, since
T is transitive). Prove that d(a, c) ≤ d(a, b) + d(b, c).
(b) Suppose (a, c) ∈ T and 0 < m < d(a, c). Prove that there is some b ∈ A
such that d(a, b) = m and d(b, c) = d(a, c) − m.

18. Suppose R is a relation on A. For each positive integer n, let
J_n = {0, 1, 2, ..., n}. If a ∈ A and b ∈ A, we will say that a function
f : J_n → A is an R-path from a to b of length n if f(0) = a, f(n) = b, and
for all i < n, (f(i), f(i + 1)) ∈ R.
(a) Prove that for all n ∈ Z⁺, R^n = {(a, b) ∈ A × A | there is an R-path from
a to b of length n}.
(b) Prove that the transitive closure of R is {(a, b) ∈ A × A | there is an
R-path from a to b (of any length)}.

19. Suppose R is a relation on A. In this problem we find a relationship
between distance, as defined in exercise 17, and R-paths, which were
discussed in exercise 18.
(a) Suppose d(a, b) = n and a ≠ b. Prove that if f is an R-path from a to b of
length n, then f is one-to-one.
(b) Suppose d(a, a) = n. Prove that if f is an R-path from a to a of length n,
then ∀i < n ∀j < n (f(i) = f(j) → i = j). (In other words, f is one-to-one,
except for the fact that f(0) = f(n) = a.)

20. Suppose R is a relation on A, T is the transitive closure of R, and A has
m elements. Prove that

T = R ∪ R^2 ∪ ··· ∪ R^m = ⋃{R^n | 1 ≤ n ≤ m}.

(Hint: Use exercise 19.)


1 The terminology here is somewhat unfortunate, since what we are calling the division 
algorithm is actually a theorem and not an algorithm. Nevertheless, this terminology is 
common. 
Number Theory 


7.1. Greatest Common Divisors 


In this chapter we will give an introduction to number theory: the study of 
the positive integers 1, 2, 3, .... It may seem that these numbers are so 
easy to understand that investigating them will not lead to any interesting 
discoveries. But we will see in this chapter that simple questions about the 
positive integers can be surprisingly difficult to resolve, and the answers 
sometimes reveal subtle and unexpected patterns. Of course, the only way 
to be sure of the answers to our questions will be to give proofs, using the 
methods we have developed in earlier chapters of this book. By now, you 
should be fairly proficient at reading and writing proofs, so we’ll give less 
discussion of the strategy behind proofs and leave more proofs as exercises. 

We begin with a concept that is fundamental to all of number theory, the 
greatest common divisor of a pair of positive integers. 


Definition 7.1.1. Suppose a is a positive integer. The divisors of a are the 
positive integers that divide a. We will denote the set of divisors of a by 
D(a). Thus, 


D(a) = {d ∈ Z⁺ | d divides a} = {d ∈ Z⁺ | ∃k ∈ Z(a = kd)}.


If a and b are two positive integers, then D(a) ∩ D(b) is the set of positive
integers that divide both a and b, that is, the common divisors of a and b. The
largest element of this set is called the greatest common divisor of a and b, 
and is denoted gcd(a, b). 


For example, D(18) = {1, 2, 3, 6, 9, 18} and D(12) = {1, 2, 3, 4, 6, 12}, 
so the set of common divisors of 18 and 12 is D(18) ∩ D(12) = {1, 2, 3, 6}.
The largest of these common divisors is 6, so gcd(18, 12) = 6. 
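
Definition 7.1.1 translates almost word for word into a few lines of Python. The sketch below is a direct, brute-force reading of the definition; the names divisors and gcd_by_definition are chosen for this illustration. It is meant only to make the definition concrete and is far too slow for large inputs.

```python
def divisors(a):
    """Return D(a), the set of positive integers that divide a."""
    return {d for d in range(1, a + 1) if a % d == 0}


def gcd_by_definition(a, b):
    """Return the largest element of D(a) ∩ D(b)."""
    return max(divisors(a) & divisors(b))


print(sorted(divisors(18)))                 # [1, 2, 3, 6, 9, 18]
print(sorted(divisors(18) & divisors(12)))  # [1, 2, 3, 6]
print(gcd_by_definition(18, 12))            # 6
```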


Notice that 1 and a are always elements of D(a), and D(a) is a finite set,
since D(a) ⊆ {1, 2, ..., a}. Thus, for any two positive integers a and b,
D(a) ∩ D(b) is a finite set that is nonempty (since it contains 1), so it has a
largest element (see exercise 3 in Section 6.2). In other words, gcd(a, b) is 
always defined. 

Given two positive integers a and b, how can we compute gcd(a, b)? One 
possibility is to start by listing all elements of D(a) and D(b), as we did 
when we computed gcd(18, 12). But if a and b are large then this may be 
impractical. Fortunately, there is a better way. 

Since D(a) ∩ D(b) = D(b) ∩ D(a), gcd(a, b) = gcd(b, a). In other words,
in our notation for the greatest common divisor of two positive integers, it
doesn’t matter which integer we list first. We will often find it convenient to
list the larger integer first; in particular, when computing gcd(a, b), we will
assume that a ≥ b.

One helpful observation is that if b | a then gcd(a, b) = b. This is because
b is the largest element of D(b). If b | a then b is also an element of D(a), so
it must be the largest element of D(a) ∩ D(b). This suggests that to compute
gcd(a, b), where a ≥ b, we could start by dividing a by b. According to the
division algorithm (Theorem 6.4.1), if we divide a by b we will find natural
numbers q and r (the quotient and remainder) such that a = qb + r and r < b.
If r = 0, then a = qb, so b | a and therefore gcd(a, b) = b.

But what if r > 0? How can we compute gcd(a, b) in that case? We claim
that in that case, D(a) ∩ D(b) = D(b) ∩ D(r). Let’s prove this fact. Suppose
first that d ∈ D(a) ∩ D(b). Then d | a and d | b, so there are integers j and k
such that a = jd and b = kd. But then from the equation a = qb + r we get
r = a − qb = jd − qkd = (j − qk)d, so d | r. Therefore d ∈ D(r), and since we
also have d ∈ D(b), d ∈ D(b) ∩ D(r). A similar argument shows that if
d ∈ D(b) ∩ D(r) then d ∈ D(a) ∩ D(b), so D(a) ∩ D(b) = D(b) ∩ D(r). By the
definition of greatest common divisor, it follows that gcd(a, b) = gcd(b, r).

Let’s summarize what we’ve learned with a theorem. 


Theorem 7.1.2. Suppose a and b are positive integers with a ≥ b. Let r be
the remainder when we divide a by b. If r = 0 then gcd(a, b) = b, and if r > 0
then gcd(a, b) = gcd(b, r).


Now, if r > 0, does this theorem help us to compute gcd(a, b)? One 
reason to think it might is that b < a and r < b, so it is probably easier to 


compute gcd(b, r) than gcd(a, b). Thus, the theorem allows us to replace our 
original problem of computing gcd(a, b) with the potentially easier problem 
of computing gcd(b, r). 

This should remind you of our study of recursion in Chapter 6. A 
recursive definition of a function f with domain Z⁺ gives us a method of
finding f(n) by using the values of f(k) for k < n. By using this method 
repeatedly, we are able to compute f(n) for any n. Perhaps if we apply our 
division method repeatedly we will be able to compute gcd(a, b). 

Before working out this idea in general, let’s try it out in an example. 
Suppose we want to find gcd(672, 161). We begin by dividing a = 672 by b 
= 161, which gives us a quotient q = 4 and remainder r = 28: 


672 = 4 · 161 + 28.


By Theorem 7.1.2, we conclude that gcd(672, 161) = gcd(a, b) = gcd(b, r) = 
gcd(161, 28). So let’s try to compute gcd(161, 28), which seems like an 
easier problem. 

How do we solve this problem? By the same method, of course! We start 
by dividing 161 by 28, to get a quotient of 5 and remainder of 21: 


161 = 5 · 28 + 21.


Applying Theorem 7.1.2 again, we see that gcd(161, 28) = gcd(28, 21). To 
compute gcd(28, 21) we divide 28 by 21: 


28 = 1 · 21 + 7.


Thus gcd(28, 21) = gcd(21, 7). But 21 = 3 · 7 + 0, so 7 | 21 and therefore
gcd(21, 7) = 7. We conclude that this is the answer to our original problem: 
gcd(672, 161) = 7. 

We can summarize our calculations with the following list of equations: 


672 = 4 · 161 + 28,
161 = 5 · 28 + 21,
28 = 1 · 21 + 7,
21 = 3 · 7 + 0.


These calculations produce a decreasing list of natural numbers: 672, 161, 
28, 21, 7, 0. The first two numbers are our original positive integers a and 
b, and after that every number is the remainder when dividing the previous 
number into the one before that. The greatest common divisors of all 
adjacent pairs of positive integers in the list are the same. The calculation 
ended when we got a remainder of 0, and the last nonzero number in the list 
is 7 = gcd(21, 7) = gcd(672, 161). 
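
The list of numbers in this example can be regenerated with a few lines of Python. This is only a sketch of the calculation just described; the function name remainder_sequence is an assumption made for this illustration, and Python's % operator supplies the remainder from each division.

```python
def remainder_sequence(a, b):
    """Return the list a, b, followed by each successive remainder,
    stopping at the first remainder that equals 0."""
    numbers = [a, b]
    while numbers[-1] != 0:
        # Divide the second-to-last number by the last one; keep the remainder.
        numbers.append(numbers[-2] % numbers[-1])
    return numbers


print(remainder_sequence(672, 161))  # [672, 161, 28, 21, 7, 0]
```

The last nonzero entry of the returned list is the greatest common divisor of the two inputs.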

Now let’s generalize. Suppose we want to find gcd(a, b), where a and b
are positive integers and a ≥ b. We define a sequence of natural numbers
r_0, r_1, r_2, ... recursively as follows. To start off the sequence, we let
r_0 = a and r_1 = b; notice that r_0 ≥ r_1. Then we let q_2 and r_2 be the
quotient and remainder when we divide r_0 by r_1:

r_0 = q_2 · r_1 + r_2,    0 ≤ r_2 < r_1.

If r_2 ≠ 0, then we divide r_1 by r_2 to get a quotient q_3 and remainder r_3. In
general, having computed r_0, r_1, ..., r_n, if r_n ≠ 0 then we divide r_{n−1} by r_n
to produce a quotient and remainder of q_{n+1} and r_{n+1}:

r_{n−1} = q_{n+1} · r_n + r_{n+1},    0 ≤ r_{n+1} < r_n.

The calculation stops when we reach a remainder of 0.

Are we sure that we will eventually have a remainder of 0? Well, if we 
don’t, then the sequence of divisions will go on forever, and we will end up 
with an infinite sequence of positive integers r_0, r_1, r_2, ... with
r_0 ≥ r_1 > r_2 > ···. This is impossible, since {r_0, r_1, r_2, ...} would be
a nonempty set of natural numbers with no smallest element, contradicting
the well-ordering
principle (Theorem 6.4.4). Thus, we must eventually have a remainder of 0. 


Suppose m is the largest index for which r_m ≠ 0. Then r_{m+1} = 0, and there
are m divisions, which can be summarized as follows:

r_0 = q_2 · r_1 + r_2,
r_1 = q_3 · r_2 + r_3,
⋮
r_{m−1} = q_{m+1} · r_m + 0.

Applying Theorem 7.1.2 to each division, we conclude that

gcd(a, b) = gcd(r_0, r_1) = gcd(r_1, r_2) = ··· = gcd(r_{m−1}, r_m) = r_m.

Thus, gcd(a, b) is the last nonzero value in the sequence r_0, r_1, r_2, ....


This method of computing the greatest common divisor of two positive 
integers is called the Euclidean algorithm. It is named for Euclid, who 
described it in Book VII of his Elements. 
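
As a rough sketch, the Euclidean algorithm can be written as a short loop that keeps only the two most recent values r_{n−1} and r_n. The function name euclid_gcd and the decision to return the number of divisions m alongside the answer are choices made for this illustration; the cross-check against math.gcd uses Python's standard library.

```python
import math


def euclid_gcd(a, b):
    """Return (gcd(a, b), m), where m is the number of divisions performed.

    Assumes a and b are positive integers with a >= b.
    """
    prev, cur = a, b            # r_{n-1} and r_n, starting with r_0 and r_1
    divisions = 0
    while cur != 0:             # stop when a remainder of 0 is reached
        prev, cur = cur, prev % cur
        divisions += 1
    return prev, divisions      # prev is the last nonzero remainder r_m


print(euclid_gcd(672, 161))     # (7, 4)
assert euclid_gcd(672, 161)[0] == math.gcd(672, 161)
```

Because each remainder is strictly smaller than the one before, the loop condition must eventually fail, which mirrors the well-ordering argument given above.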


Example 7.1.3. Find the greatest common divisor of 444 and 1392. 
Solution 


We apply the Euclidean algorithm with a = 1392 and b = 444. The
calculations are shown in Figure 7.1. Each equation in the column 
“Division” shows the division calculation that leads to the quotient and 
remainder in the next row. Since the last nonzero remainder is 12, we 
conclude that gcd(1392, 444) = 12. 


n   q_n   r_n    Division
0         1392
1         444    1392 = 3 · 444 + 60
2   3     60     444 = 7 · 60 + 24
3   7     24     60 = 2 · 24 + 12
4   2     12     24 = 2 · 12 + 0
5   2     0


Figure 7.1. Calculation of gcd(1392, 444) by Euclidean algorithm. 


The inputs to the Euclidean algorithm in the last example were a = 1392 
and b = 444. It is instructive to see how the remainders we computed are 
related to these inputs. Rearranging the first equation in the “Division” 
column in Figure 7.1, we see that 


r_2 = 60 = 1392 − 3 · 444 = a − 3b.


Similarly, from the next equation we get


r_3 = 24 = 444 − 7 · 60 = b − 7r_2 = b − 7(a − 3b) = −7a + 22b,

and the third equation gives us

r_4 = 12 = 60 − 2 · 24 = r_2 − 2r_3 = (a − 3b) − 2(−7a + 22b) = 15a − 47b.


We see that each remainder can be written in the form sa + tb, for some 
integers s and t. We say that each remainder is a linear combination of a 
and b. But the last nonzero remainder is the greatest common divisor of a 
and b, so we conclude that gcd(a, b) is a linear combination of a and b: 
gcd(a, b) = r_4 = 15a − 47b. Working out this reasoning in general proves
our next theorem.


Theorem 7.1.4. For all positive integers a and b there are integers s and t 
such that gcd(a, b) = sa + tb. 


Proof. As usual, we may assume a ≥ b; if not, we can simply reverse the
values of a and b. Let r_0, r_1, ..., r_{m+1} be the sequence of numbers produced
by the Euclidean algorithm, where r_m ≠ 0 and r_{m+1} = 0. We claim that for
every natural number n ≤ m, r_n is a linear combination of a and b. In other
words, for every natural number n, if n ≤ m then there are integers s_n and t_n
such that r_n = s_n a + t_n b. We prove this statement by strong induction.

Suppose n is a natural number and n ≤ m, and suppose also that for all
k < n, r_k is a linear combination of a and b. We now consider three cases.

Case 1: n = 0. Then r_n = r_0 = a = s_0 a + t_0 b, where s_0 = 1 and t_0 = 0.

Case 2: n = 1. Then r_n = r_1 = b = s_1 a + t_1 b, where s_1 = 0 and t_1 = 1.

Case 3: n ≥ 2. Then r_n is the remainder when r_{n−2} is divided by r_{n−1}:

r_{n−2} = q_n · r_{n−1} + r_n.

By the inductive hypothesis, there are integers s_{n−1}, s_{n−2}, t_{n−1}, and
t_{n−2} such that

r_{n−1} = s_{n−1} a + t_{n−1} b,    r_{n−2} = s_{n−2} a + t_{n−2} b.

Therefore

r_n = r_{n−2} − q_n · r_{n−1} = (s_{n−2} a + t_{n−2} b) − q_n (s_{n−1} a + t_{n−1} b)
    = (s_{n−2} − q_n s_{n−1}) a + (t_{n−2} − q_n t_{n−1}) b,

so r_n = s_n a + t_n b, where s_n = s_{n−2} − q_n s_{n−1} and t_n = t_{n−2} − q_n t_{n−1}.

This completes the inductive proof that for every n ≤ m, r_n is a linear
combination of a and b. Applying this statement in the case n = m, we
conclude that gcd(a, b) = r_m is a linear combination of a and b. □


For an alternative proof of Theorem 7.1.4, see exercise 4. One advantage 
of the proof we have given is that it provides us with a method to find 
integers s and t such that gcd(a, b) = sa + tb. While carrying out the 
Euclidean algorithm, we can compute numbers s_n and t_n recursively by
using the formulas:

s_0 = 1, t_0 = 0,
s_1 = 0, t_1 = 1,
for n ≥ 2, s_n = s_{n−2} − q_n s_{n−1}, t_n = t_{n−2} − q_n t_{n−1}.

If m is the largest index for which r_m ≠ 0, then gcd(a, b) = r_m = s_m a + t_m b.
The version of the Euclidean algorithm in which we keep track of these
extra numbers s_n and t_n is called the extended Euclidean algorithm.


Example 7.1.5. Use the extended Euclidean algorithm to find gcd(574, 
168) and express it as a linear combination of 574 and 168. 


Solution 


The calculations are shown in Figure 7.2. We conclude that gcd(574, 168) =
14 = 5 · 574 − 17 · 168.


n   q_n   r_n   s_n                 t_n                  Division
0         574   1                   0
1         168   0                   1                    574 = 3 · 168 + 70
2   3     70    1 − 3 · 0 = 1       0 − 3 · 1 = −3       168 = 2 · 70 + 28
3   2     28    0 − 2 · 1 = −2      1 − 2 · (−3) = 7     70 = 2 · 28 + 14
4   2     14    1 − 2 · (−2) = 5    −3 − 2 · 7 = −17     28 = 2 · 14 + 0
5   2     0


Figure 7.2. Calculation of gcd(574, 168) by extended Euclidean algorithm. 
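

The recursive formulas for s_n and t_n fit into the same loop as the remainders. The Python sketch below is one possible way to organize the extended Euclidean algorithm, under the assumption that the inputs are positive integers; the function name extended_euclid and the variable names are chosen for this illustration. At each step it keeps the two most recent values of r, s, and t and updates all three with the same quotient.

```python
def extended_euclid(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g = s*a + t*b.

    Assumes a and b are positive integers.
    """
    r_prev, r_cur = a, b     # r_0 = a, r_1 = b
    s_prev, s_cur = 1, 0     # s_0 = 1, s_1 = 0
    t_prev, t_cur = 0, 1     # t_0 = 0, t_1 = 1
    while r_cur != 0:
        q = r_prev // r_cur  # the next quotient
        r_prev, r_cur = r_cur, r_prev - q * r_cur
        s_prev, s_cur = s_cur, s_prev - q * s_cur
        t_prev, t_cur = t_cur, t_prev - q * t_cur
    # r_prev is now the last nonzero remainder r_m, with matching s_m and t_m.
    return r_prev, s_prev, t_prev


g, s, t = extended_euclid(574, 168)
print(g, s, t)               # 14 5 -17
assert g == s * 574 + t * 168
```

Running the same function on 1392 and 444 gives 12 = 15 · 1392 − 47 · 444, matching the calculation that preceded Theorem 7.1.4.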


As an immediate consequence of Theorem 7.1.4, we have the following 
surprising fact. 


Theorem 7.1.6. For all positive integers a, b, and d, if d | a and d | b then d 
| gcd(a, b). 


Proof. Let a, b, and d be arbitrary positive integers and suppose that d | a 
and d | b. Then there are integers j and k such that a = jd and b = kd. Now 
by Theorem 7.1.4 let s and t be integers such that gcd(a, b) = sa + tb. Then 


gcd(a, b) = sa + tb = sjd + tkd = (sj + tk)d, 


so d | gcd(a, b). □

Recall from part 3 of Example 4.4.3 that the divisibility relation is a
partial order on Z⁺. We could interpret Theorem 7.1.6 as saying that
gcd(a, b) is the largest element of D(a) ∩ D(b) not only with respect to the
usual ordering of the positive integers, but also with respect to the
divisibility partial order.


Exercises 


1. Let a = 57 and b = 36.
(a) Find D(a), D(b), and D(a) ∩ D(b).
(b) Use the Euclidean algorithm to find gcd(a, b).

*2. Find gcd(a, b), and express it as a linear combination of a and b.
(a) a = 775, b = 682.
(b) a = 562, b = 243.

3. Find gcd(a, b), and express it as a linear combination of a and b.
(a) a = 2790, b = 1206.
(b) a = 191, b = 156.

4. Complete the following alternative proof of Theorem 7.1.4. Suppose a
and b are positive integers. Let L = {n ∈ Z⁺ | ∃s ∈ Z ∃t ∈ Z(n = sa + tb)}.
Show that L has a smallest element. Let d be the smallest element
of L. Now show that d = gcd(a, b). (Hint: Show that when you divide
either a or b by d, the remainder cannot be positive.)

5. Suppose a and b are positive integers, and let d = gcd(a, b). Show that
for every integer n, n is a linear combination of a and b iff d | n.

6. Prove that for all positive integers a, b, and c, gcd(a, b) = gcd(a + bc, b).

7. Suppose that a, a′, b, and b′ are positive integers.
(a) If a < a′ and b < b′, must it be the case that gcd(a, b) < gcd(a′, b′)?
Justify your answer with either a proof or a counterexample.
(b) If a | a′ and b | b′, must it be the case that gcd(a, b) | gcd(a′, b′)?
Justify your answer with either a proof or a counterexample.

8. Prove that for every positive integer a, gcd(5a + 2, 13a + 5) = 1.

9. Prove that for all positive integers a and b,
gcd(2^a − 1, 2^b − 1) = 2^{gcd(a,b)} − 1.

10. Prove that for all positive integers a, b, and n, gcd(na, nb) = n gcd(a, b).

11. Suppose a, b, and c are positive integers.
(a) Prove that D(gcd(a, b)) = D(a) ∩ D(b).
(b) Prove that gcd(gcd(a, b), c) is the largest element of
D(a) ∩ D(b) ∩ D(c).

12. (a) Use the Euclidean algorithm to find gcd(55, 34). Do you
recognize the numbers in the sequence r_0, r_1, ...? (Hint: Look
back at Section 6.4.) How many division steps are there?
(b) Suppose n ≥ 2. What is gcd(F_{n+1}, F_n)? How many division steps are
there when using the Euclidean algorithm to find gcd(F_{n+1}, F_n)? (F_n is
the nth Fibonacci number.)


13. Suppose a and b are positive integers with a ≥ b. Let r_0, r_1, ..., r_{m+1} be
the sequence of numbers produced when using the Euclidean algorithm
to compute gcd(a, b), where r_m ≠ 0 and r_{m+1} = 0. Note that this means
that the algorithm required m divisions.
(a) Prove that ∀k ∈ N(k < m → r_{m−k} ≥ F_{k+2}), where F_{k+2} is the (k + 2)th
Fibonacci number.
(b) Let φ = (1 + √5)/2. (φ is the golden ratio; see exercise 20 in Section
6.4.) Prove that for every positive integer k, F_k > φ^k/√5 − 1. (Hint:
Use Theorem 6.4.3.)
(c) Show that

\[ m < \frac{\log(b + 1)}{\log \varphi} + \frac{\log 5}{2 \log \varphi} - 1. \]

(You can use either base-10 logarithms or natural logarithms in this
formula.)
(d) Show that if b has at most 100 digits, then the number of divisions
when using the Euclidean algorithm to compute gcd(a, b) will be at
most 479.

14. (a) Prove the following alternative version of the division algorithm:
For any positive integers a and b, there are natural numbers q and
r such that r ≤ b/2 and either a = qb + r or a = qb − r.
(b) Suppose that a, b, and r are positive integers, q is a natural number, and
either a = qb + r or a = qb − r. Prove that gcd(a, b) = gcd(b, r).
(c) Suppose a and b are positive integers with a ≥ b. Define a sequence
r_0, r_1, ... recursively as follows: r_0 = a, r_1 = b, and for all n ≥ 1, if
r_n ≠ 0 then we use part (a) to find natural numbers q_{n+1} and r_{n+1} such
that r_{n+1} ≤ r_n/2 and either r_{n−1} = q_{n+1} r_n + r_{n+1} or
r_{n−1} = q_{n+1} r_n − r_{n+1}. Prove that there is some m such that r_m ≠ 0 and
r_{m+1} = 0, and gcd(a, b) = r_m. This gives us a new method of computing
greatest common divisors; it is called the least absolute remainder
Euclidean algorithm.
(d) Compute gcd(1515, 555) by both the Euclidean algorithm and the least
absolute remainder Euclidean algorithm. Which takes fewer steps?


7.2. Prime Factorization 


In Section 6.4 we saw that every integer n > 1 is either prime or can be 
written as a product of prime numbers; we say that n has a prime 
factorization. In this section we will show that this prime factorization is in 
a certain sense unique. One important tool in this investigation will be 
greatest common divisors. In particular, we will be interested in pairs of 
positive integers whose greatest common divisor has the smallest possible 
value, 1. 


Definition 7.2.1. If a and b are positive integers and gcd(a, b) = 1, then we 
say that a and b are relatively prime. 


Equivalently, we can say that a and b are relatively prime if their only 
common divisor is 1. For example, D(50) = {1, 2, 5, 10, 25, 50} and D(63)
= {1, 3, 7, 9, 21, 63}, so D(50) ∩ D(63) = {1}. Therefore gcd(50, 63) = 1,
so 50 and 63 are relatively prime. 

One reason relatively prime integers are important is given by our next 
theorem. The key to the proof of the theorem is the use of existential 
instantiation to introduce names for integers that we know exist. 


Theorem 7.2.2. For all positive integers a, b, and c, if c | ab and gcd(a, c) = 
1 then c | b. 


Proof. Suppose c | ab and gcd(a, c) = 1. Then there is some integer j such 
that ab = jc, and by Theorem 7.1.4, there are integers s and t such that sa + 
tc = 1. Therefore 


b = b · 1 = b · (sa + tc) = sab + tbc = sjc + tbc = (sj + tb)c,

so c | b. □


Notice that if p is a prime number then D(p) = {1, p}. Thus, for any 
positive integer a, the only possible values of gcd(a, p) are 1 and p. If p | a 
then gcd(a, p) = p, and if not, then the only common divisor of a and p is 1 
and therefore a and p are relatively prime. Combining this observation with 
Theorem 7.2.2, we get the following important fact about prime divisors. 


Theorem 7.2.3. For all positive integers a, b, and p, if p is prime and p | ab 
then either p | a or p | b. 


Proof. Suppose p is prime and p | ab. As we observed earlier, if p ∤ a then a
and p are relatively prime, and therefore by Theorem 7.2.2, p | b. Thus,
either p | a or p | b. □


Commentary. Notice that to prove the disjunction (p | a) ∨ (p | b), we used
the strategy of assuming p ∤ a and then proving p | b.
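
The hypothesis that p is prime cannot be dropped from Theorem 7.2.3, and a small experiment shows why. The Python sketch below is only an illustration; the helper names is_prime and divides_a_factor and the ranges tested are assumptions made for this example. It checks the conclusion of the theorem by brute force for small primes, and then shows that it fails for the composite number 6, which divides 4 · 9 = 36 but divides neither 4 nor 9.

```python
import math


def is_prime(p):
    """Trial-division primality test, adequate for small p."""
    return p > 1 and all(p % d != 0 for d in range(2, math.isqrt(p) + 1))


def divides_a_factor(p, a, b):
    """Return True unless p divides a*b while dividing neither a nor b."""
    return (a * b) % p != 0 or a % p == 0 or b % p == 0


small_primes = [p for p in range(2, 30) if is_prime(p)]
assert all(divides_a_factor(p, a, b)
           for p in small_primes
           for a in range(1, 40)
           for b in range(1, 40))

print(divides_a_factor(6, 4, 9))  # False: 6 | 36, but 6 divides neither 4 nor 9
```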


Using mathematical induction, we can extend this theorem to the case of 
a prime number dividing a product of a list of positive integers. 


Theorem 7.2.4. Suppose p is a prime number and aj, d>,... , a, are 
positive integers. If p | (a, a °° + ap), then for some i E {1, 2,..., k}, p | a; 


Proof. We prove this theorem by induction on k. In other words, we will use
induction to prove the following statement: for every k ≥ 1, if p divides the
product of any list of k positive integers, then it divides one of the integers
in the list.

Our base case is k = 1, and in that case the statement is clearly true: if
$p \mid a_1$, then there is some $i \in \{1\}$ such that $p \mid a_i$, namely, i = 1.


Now suppose the statement holds for any list of k positive integers, and
let $a_1, a_2, \ldots, a_{k+1}$ be a list of positive integers such that
$p \mid (a_1 a_2 \cdots a_k a_{k+1})$. Since $a_1 a_2 \cdots a_k a_{k+1} = (a_1 a_2 \cdots a_k) a_{k+1}$,
by Theorem 7.2.3 either $p \mid (a_1 a_2 \cdots a_k)$ or $p \mid a_{k+1}$. In the first case, by the
inductive hypothesis we have $p \mid a_i$ for some $i \in \{1, 2, \ldots, k\}$, and in the
second we have $p \mid a_i$ where i = k + 1. □


We are now ready to address the issue of the uniqueness of prime 
factorizations. Consider, for example, the problem of writing 12 as a 
product of prime numbers. There are actually three different ways to write 
12 as a product of prime numbers: 12 = 2 · 2 · 3 = 2 · 3 · 2 = 3 · 2 · 2. But of
course in all three cases we are multiplying the same three prime numbers, 
just in a different order. To avoid counting these as three different prime 
factorizations of 12, we will only consider factorizations in which the
primes are listed from smallest to largest. There is only one prime
factorization of 12 that meets this additional requirement: 12 = 2 · 2 · 3.
More generally, we will be interested in expressions of the form
$p_1 p_2 \cdots p_k$, where $p_1, p_2, \ldots, p_k$ are prime numbers and
$p_1 \le p_2 \le \cdots \le p_k$. We will say that such an expression is the product
of a nondecreasing list of prime numbers. We will show that every integer
larger than 1 can be written as the product of a nondecreasing list of prime
numbers in a unique way.


Recall that, to show that an object with some property is unique, we show 
that any two objects with the property would have to be equal. Thus, the 
key to proving the uniqueness of prime factorizations will be the following 
fact. 


Theorem 7.2.5. Suppose that $p_1, p_2, \ldots, p_k$ and $q_1, q_2, \ldots, q_m$ are prime
numbers, $p_1 \le p_2 \le \cdots \le p_k$, $q_1 \le q_2 \le \cdots \le q_m$, and
$p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_m$. Then k = m and for all $i \in \{1, \ldots, k\}$, $p_i = q_i$.


Proof. The proof will be by induction on k. In other words, we use 
induction to prove that for all k ≥ 1, if the product of some nondecreasing
list of k prime numbers is equal to the product of another nondecreasing list 
of prime numbers, then the two lists must be the same. 


When k = 1, we have $p_1 = q_1 q_2 \cdots q_m$. If m > 1 then this contradicts the
fact that $p_1$ is prime. Therefore m = 1 and $p_1 = q_1$.

For the induction step, suppose the statement is true for products of
nondecreasing lists of k prime numbers, and suppose that $p_1, p_2, \ldots, p_{k+1}$
and $q_1, q_2, \ldots, q_m$ are prime numbers, $p_1 \le p_2 \le \cdots \le p_{k+1}$,
$q_1 \le q_2 \le \cdots \le q_m$, and $p_1 p_2 \cdots p_{k+1} = q_1 q_2 \cdots q_m$. Notice that if m = 1
then this equation says $p_1 p_2 \cdots p_{k+1} = q_1$, and as in the base case this
contradicts the fact that $q_1$ is prime, so m > 1.

Clearly $p_{k+1} \mid (p_1 p_2 \cdots p_{k+1})$, so $p_{k+1} \mid (q_1 q_2 \cdots q_m)$, and by Theorem
7.2.4 it follows that $p_{k+1} \mid q_i$ for some i. Since $p_{k+1}$ and $q_i$ are both
prime, this means $p_{k+1} = q_i$, and therefore $p_{k+1} \le q_i \le q_m$. A similar
argument shows that $q_m \mid p_j$ for some j, so $q_m \le p_j \le p_{k+1}$. We conclude that
$p_{k+1} = q_m$. Canceling these factors from the equation
$p_1 p_2 \cdots p_{k+1} = q_1 q_2 \cdots q_m$ gives us $p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_{m-1}$, and now
the inductive hypothesis tells us that the remaining factors on both sides of
the equation are the same, as required. □


We now have in place everything we need to establish the existence and 
uniqueness of prime factorizations. This theorem is so important it is known 
as the fundamental theorem of arithmetic. 


Theorem 7.2.6. (Fundamental theorem of arithmetic) For every integer n > 1
there are unique prime numbers $p_1, p_2, \ldots, p_k$ such that
$p_1 \le p_2 \le \cdots \le p_k$ and $n = p_1 p_2 \cdots p_k$.


Proof. By Theorem 6.4.2, every integer greater than 1 is either prime or a 
product of primes. Listing the primes from smallest to largest gives us the 
required nondecreasing prime factorization. Uniqueness of the factorization 
follows from Theorem 7.2.5. □


If we write the product of the list of prime numbers $p_1, p_2, \ldots, p_k$ in the
form $1 \cdot p_1 p_2 \cdots p_k$, then it is natural to introduce the convention that the
product of the empty list is 1. With this convention we can extend the
fundamental theorem of arithmetic to say that every positive integer has a
unique prime factorization, where the factorization of the number 1 is the
product of the empty list of prime numbers.


Example 7.2.7. Find the prime factorizations of the following integers: 
275, 276, 277. 


Solution 


The most straightforward way to find the prime factorization of a positive 
integer is to search for its smallest prime divisor, factor it out, and repeat 
until all factors are prime. This gives the following results. (Note that 277 is 
prime, so the factoring process for 277 stops immediately.) 


275 = 5 · 55 = 5 · 5 · 11,
276 = 2 · 138 = 2 · 2 · 69 = 2 · 2 · 3 · 23,
277 = 277.
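The procedure described in this solution (find the smallest prime divisor, factor it out, repeat) can be sketched in Python. This is only an illustrative sketch; the function name `prime_factorization` is our own and does not come from the text. It returns the nondecreasing list of primes, and it returns the empty list for 1, matching the convention that the product of the empty list is 1.

```python
def prime_factorization(n: int) -> list:
    """Return the nondecreasing list of primes whose product is n (n >= 1)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = []
    d = 2
    while d * d <= n:
        # All smaller candidates have already been divided out, so if d
        # divides n here, d is the smallest divisor > 1 of n and is prime.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        # What remains has no divisor d with 1 < d and d*d <= n, so it is prime.
        factors.append(n)
    return factors


for n in (275, 276, 277):
    print(n, prime_factorization(n))
```

Trial division is simple but slow for large inputs; it is used here only because it mirrors the method in the text.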


When there are repeated primes in the prime factorization of an integer, 
we often use exponent notation to write the prime factorization. For 
example, the factorizations of 275 and 276 in the last example could be 
written in the form $275 = 5^2 \cdot 11$ and $276 = 2^2 \cdot 3 \cdot 23$. More generally, we
can write the prime factorization of a positive integer n in the form
$n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, where $p_1, p_2, \ldots, p_k$ are prime numbers,
$p_1 < p_2 < \cdots < p_k$, and $e_1, e_2, \ldots, e_k$ are positive integers. Again, by the
fundamental theorem of arithmetic, this representation of n is unique.

The fundamental theorem of arithmetic can provide insight into a number 
of concepts of number theory. For example, suppose n and d are positive 
integers and d | n. Then there is some positive integer c such that cd = n. 
Now let the prime factorizations of c and d be $c = p_1 p_2 \cdots p_k$ and
$d = q_1 q_2 \cdots q_m$. Then $n = cd = p_1 p_2 \cdots p_k q_1 q_2 \cdots q_m$. If we rearrange the primes
in this product into nondecreasing order, then this must be the unique prime 
factorization of n. Therefore d must be the product of some subcollection of 
the primes in the prime factorization of n. Notice that we are including here 
the possibility that the subcollection is the empty subcollection (so that d = 
1 and c = n) or that it includes all of the primes in the factorization of n (so 
that d = n and c = 1). 

Rephrasing this conclusion using exponent notation, suppose the prime
factorization of n is $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$. Then the divisors of n are precisely
the numbers of the form $p_1^{f_1} p_2^{f_2} \cdots p_k^{f_k}$, where for all $i \in \{1, 2, \ldots, k\}$,
$0 \le f_i \le e_i$. For example, we saw in Example 7.2.7 that the prime factorization of
276 is $276 = 2^2 \cdot 3 \cdot 23$. Therefore

\[ D(276) = \{1,\ 2,\ 2^2,\ 3,\ 2 \cdot 3,\ 2^2 \cdot 3,\ 23,\ 2 \cdot 23,\ 2^2 \cdot 23,\ 3 \cdot 23,\ 2 \cdot 3 \cdot 23,\ 2^2 \cdot 3 \cdot 23\} \]
\[ \phantom{D(276)} = \{1, 2, 4, 3, 6, 12, 23, 46, 92, 69, 138, 276\}. \]
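The description of divisors in terms of exponents translates directly into an enumeration: choose each exponent $f_i$ independently between 0 and $e_i$. The following Python sketch does this; the function name and the dictionary representation of a factorization (prime mapped to exponent) are our own assumptions, not notation from the text.

```python
from itertools import product


def divisors_from_factorization(factorization: dict) -> list:
    """List all divisors of n, given n's factorization as {prime: exponent}."""
    primes = list(factorization)
    # For each prime p_i with exponent e_i, the exponent f_i ranges over 0..e_i.
    exponent_ranges = [range(e + 1) for e in factorization.values()]
    divisors = []
    for exponents in product(*exponent_ranges):
        d = 1
        for p, f in zip(primes, exponents):
            d *= p ** f
        divisors.append(d)
    return sorted(divisors)


# 276 = 2^2 * 3 * 23, so there are (2+1)(1+1)(1+1) = 12 divisors.
print(divisors_from_factorization({2: 2, 3: 1, 23: 1}))
```

Note that the number of divisors is the product of the numbers $e_i + 1$, since the exponent choices are independent.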


Prime factorization can also help us understand greatest common
divisors. Suppose a and b are positive integers. Let $p_1, p_2, \ldots, p_k$ be a list
of all primes that occur in the prime factorization of either a or b. Then we
can write a and b in the form

\[ a = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}, \qquad b = p_1^{f_1} p_2^{f_2} \cdots p_k^{f_k}, \]

where some of the exponents $e_i$ and $f_i$ might be 0, since some primes might
occur in only one of the factorizations. By the discussion of divisibility and
prime factorization in the previous paragraph, the common divisors of a and
b are all numbers of the form $p_1^{g_1} p_2^{g_2} \cdots p_k^{g_k}$, where for every $i \in \{1, \ldots, k\}$,
$g_i \le e_i$ and $g_i \le f_i$. The greatest common divisor can be found by letting
each $g_i$ have the largest possible value, which is $\min(e_i, f_i)$ = the minimum
of $e_i$ and $f_i$. In other words,

\[ \gcd(a, b) = p_1^{\min(e_1, f_1)} p_2^{\min(e_2, f_2)} \cdots p_k^{\min(e_k, f_k)}. \]

For example, in Example 7.1.3 we used the Euclidean algorithm to find
that gcd(1392, 444) = 12. We could instead have factored 1392 and 444 into
primes:

\[ 1392 = 2^4 \cdot 3 \cdot 29 = 2^4 \cdot 3^1 \cdot 29^1 \cdot 37^0, \]
\[ 444 = 2^2 \cdot 3 \cdot 37 = 2^2 \cdot 3^1 \cdot 29^0 \cdot 37^1. \]

These factorizations give us another way to find the greatest common
divisor of 1392 and 444:

\[ \gcd(1392, 444) = 2^{\min(4,2)} \cdot 3^{\min(1,1)} \cdot 29^{\min(1,0)} \cdot 37^{\min(0,1)} = 2^2 \cdot 3^1 \cdot 29^0 \cdot 37^0 = 12. \]

Usually the Euclidean algorithm is a more efficient way to find the greatest 
common divisor of two positive integers than prime factorization. But if 
you happen to know the prime factorizations of two positive integers, then 
you can compute their greatest common divisor very easily. 
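To illustrate the minimum-of-exponents formula, here is a hedged Python sketch that computes a greatest common divisor from prime factorizations and compares it with the standard library's `math.gcd`. The helper names `factor` and `gcd_by_factoring` are ours. The sketch relies on the fact that the `&` operator on `collections.Counter` objects keeps the minimum of the two counts for each key, which is exactly $\min(e_i, f_i)$.

```python
from collections import Counter
from math import gcd, prod


def factor(n: int) -> Counter:
    """Map each prime divisor of n to its exponent, using trial division."""
    exponents = Counter()
    d = 2
    while d * d <= n:
        while n % d == 0:
            exponents[d] += 1
            n //= d
        d += 1
    if n > 1:
        exponents[n] += 1
    return exponents


def gcd_by_factoring(a: int, b: int) -> int:
    # Counter intersection keeps each prime with the minimum of its two
    # exponents; a prime missing from one factorization has exponent 0 there
    # and so drops out of the result.
    common = factor(a) & factor(b)
    return prod(p ** e for p, e in common.items())


print(gcd_by_factoring(1392, 444), gcd(1392, 444))
```

As the text notes, factoring is usually far more expensive than the Euclidean algorithm, so this is a way to check the formula rather than a practical method.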

Another concept that is elucidated by prime factorization is least 
common multiples. For any positive integers a and b, the least common 
multiple of a and b, denoted lcm(a, b), is the smallest positive integer m 
such that a | m and b | m. Least common multiples come up when we are 
adding fractions: to add two fractions with denominators a and b, we start 
by rewriting them with the common denominator lcm(a, b). 

Suppose that, as before,

\[ a = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}, \qquad b = p_1^{f_1} p_2^{f_2} \cdots p_k^{f_k}. \]

For each $i \in \{1, \ldots, k\}$, any common multiple of a and b must include a
factor $p_i^{g_i}$ in its prime factorization, where $g_i \ge e_i$ and $g_i \ge f_i$. The smallest
possible value of $g_i$ is the maximum of $e_i$ and $f_i$, which we will denote
$\max(e_i, f_i)$, so

\[ \operatorname{lcm}(a, b) = p_1^{\max(e_1, f_1)} p_2^{\max(e_2, f_2)} \cdots p_k^{\max(e_k, f_k)}. \]


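It is not hard to show that for any numbers e and f, min(e, f) + max(e, f) = e
+ f (see exercise 4), so

\begin{align*}
\gcd(a, b) \cdot \operatorname{lcm}(a, b)
  &= \bigl(p_1^{\min(e_1, f_1)} \cdots p_k^{\min(e_k, f_k)}\bigr) \cdot \bigl(p_1^{\max(e_1, f_1)} \cdots p_k^{\max(e_k, f_k)}\bigr) \\
  &= p_1^{\min(e_1, f_1) + \max(e_1, f_1)} \cdots p_k^{\min(e_k, f_k) + \max(e_k, f_k)} \\
  &= p_1^{e_1 + f_1} \cdots p_k^{e_k + f_k} \\
  &= \bigl(p_1^{e_1} \cdots p_k^{e_k}\bigr)\bigl(p_1^{f_1} \cdots p_k^{f_k}\bigr) = ab.
\end{align*}

This gives us another way to compute lcm(a, b):

\[ \operatorname{lcm}(a, b) = \frac{ab}{\gcd(a, b)}. \]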

For an alternative proof of this formula, see exercise 8. 
For example, we now have two ways to compute lcm(1392, 444): 


\[ \operatorname{lcm}(1392, 444) = 2^{\max(4,2)} \cdot 3^{\max(1,1)} \cdot 29^{\max(1,0)} \cdot 37^{\max(0,1)} = 2^4 \cdot 3^1 \cdot 29^1 \cdot 37^1 = 51504, \]

and

\[ \operatorname{lcm}(1392, 444) = \frac{1392 \cdot 444}{\gcd(1392, 444)} = \frac{618048}{12} = 51504. \]


This shows that if we want to add two fractions with denominators 1392 
and 444, we should use the common denominator 51504. 
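The formula lcm(a, b) = ab/gcd(a, b) is easy to check numerically. The sketch below (function names are our own) compares it with a direct search based on the definition of the least common multiple as the smallest positive integer divisible by both a and b.

```python
from math import gcd


def lcm_via_gcd(a: int, b: int) -> int:
    # gcd(a, b) divides a, hence divides ab, so the integer division is exact.
    return a * b // gcd(a, b)


def lcm_by_search(a: int, b: int) -> int:
    # Straight from the definition: step through the positive multiples of a
    # in increasing order and stop at the first one that b divides.
    m = a
    while m % b != 0:
        m += a
    return m


print(lcm_via_gcd(1392, 444), lcm_by_search(1392, 444))
```

The search version is only there as an independent check; the gcd-based version is the one that scales, since the Euclidean algorithm is fast.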


Example 7.2.8. Find the least common multiple of 1386 and 1029. 


Solution 


We begin by using the Euclidean algorithm to find gcd(1386, 1029). The 
calculations in Figure 7.3 show that gcd(1386, 1029) = 21. Therefore 


\[ \operatorname{lcm}(1386, 1029) = \frac{1386 \cdot 1029}{\gcd(1386, 1029)} = \frac{1386 \cdot 1029}{21} = 67914. \]

n    q_n    r_n     Division
0           1386
1           1029    1386 = 1 · 1029 + 357
2    1      357     1029 = 2 · 357 + 315
3    2      315     357 = 1 · 315 + 42
4    1      42      315 = 7 · 42 + 21
5    7      21      42 = 2 · 21 + 0
6    2      0


Figure 7.3. Calculation of gcd(1386, 1029) by Euclidean algorithm.


Alternatively, we could use prime factorizations: $1386 = 2 \cdot 3^2 \cdot 7 \cdot 11$ and
$1029 = 3 \cdot 7^3$, so $\operatorname{lcm}(1386, 1029) = 2 \cdot 3^2 \cdot 7^3 \cdot 11 = 67914$.


Exercises 
1. Find the prime factorizations of the following positive integers: 650, 
756, 1067. 
*2. Find lcm(1495, 650). 
3. Find lcm(1953, 868).
4. Prove that for any numbers e and f, min(e, f) + max(e, f) = e + f.
*5. Suppose a and b are positive integers. Prove that a and b are relatively
prime iff their prime factorizations have no primes in common.
6. Suppose a and b are positive integers. Prove that a and b are relatively
prime iff there are integers s and t such that sa + tb = 1.
7. Suppose a, b, a', and b' are positive integers, a and b are relatively
prime, a' | a, and b' | b. Prove that a' and b' are relatively prime.
*8. Suppose a and b are positive integers. In this exercise you will give an
alternative proof of the formula lcm(a, b) = ab/gcd(a, b). Let m =
lcm(a, b).
(a) Prove that ab/gcd(a, b) is an integer and that a | (ab/gcd(a, b)) and b |
(ab/gcd(a, b)). Use this to conclude that m ≤ ab/gcd(a, b).
(b) Let q and r be the quotient and remainder when ab is divided by m.
Thus, ab = qm + r and 0 ≤ r < m. Prove that r = 0.
(c) By part (b), ab = qm. Prove that q | a and q | b.
(d) Use part (c) to conclude that m ≥ ab/gcd(a, b). Together with part (a),
this shows that m = ab/gcd(a, b).
9. Suppose a and b are positive integers, and let d = gcd(a, b). Then d | a
and d | b, so there are positive integers j and k such that a = jd and b =
kd. Prove that j and k are relatively prime.
10. Prove that for all positive integers a, b, and d, if d | ab then there are
positive integers $d_1$ and $d_2$ such that $d = d_1 d_2$, $d_1 \mid a$, and $d_2 \mid b$.
11. Prove that for all positive integers a, b, and m, if a | m and b | m then
lcm(a, b) | m.
12. Suppose a, b, and c are positive integers. Let m be the smallest positive
integer such that a | m, b | m, and c | m. Prove that m = lcm(lcm(a, b),
c).
13. Prove that for all positive integers a and b, if $a^2 \mid b^2$ then a | b.
14. (a) Find all prime numbers p such that $5p + 9 \in \{n^2 \mid n \in \mathbb{N}\}$.
(b) Find all prime numbers p such that $15p + 4 \in \{n^2 \mid n \in \mathbb{N}\}$.
(c) Find all prime numbers p such that $5p + 8 \in \{n^3 \mid n \in \mathbb{N}\}$.
15. Let $H = \{4n + 1 \mid n \in \mathbb{N}\} = \{1, 5, 9, 13, \ldots\}$. The elements of H are
called Hilbert numbers (named for David Hilbert (1862-1943)). A
Hilbert number that is larger than 1 and cannot be written as a product
of two smaller Hilbert numbers is called a Hilbert prime. For example,
9 is a Hilbert prime. (Of course, 9 is not a prime, since 9 = 3 · 3, but
3 ∉ H.)
(a) Show that H is closed under multiplication; that is, ∀x ∈ H ∀y ∈ H (xy
∈ H).
(b) Show that every Hilbert number that is larger than 1 is either a Hilbert
prime or a product of two or more Hilbert primes.
(c) Show that 441 is a Hilbert number that can be written as a product of a
nondecreasing list of Hilbert primes in two different ways. Thus,
Hilbert prime factorization is not unique.
16. Suppose a and b are positive integers. Prove that there are relatively
prime positive integers c and d such that c | a, d | b, and cd = lcm(a, b).
17. Suppose a, b, and c are positive integers.
(a) Prove that gcd(a, bc) | (gcd(a, b) · gcd(a, c)).
(b) Prove that lcm(gcd(a, b), gcd(a, c)) | gcd(a, bc). (Hint: Use exercise
11.)
(c) Suppose that b and c are relatively prime. Prove that gcd(a, bc) =
gcd(a, b) · gcd(a, c).
18. Recall from exercise 5 in Section 6.2 that the numbers $F_n = 2^{(2^n)} + 1$ are
called Fermat numbers. Fermat showed that $F_n$ is prime for 0 ≤ n ≤ 4,
and Euler showed that $F_5$ is not prime. It is not known if there is any n
> 4 for which $F_n$ is prime. In this exercise you will see one reason why
one might be interested in prime numbers of this form. Show that if m
is a positive integer and $2^m + 1$ is prime, then m is a power of 2. (Hint: If
m is not a power of 2, then m has an odd prime number p in its prime
factorization. Thus there is a positive integer r such that m = pr. Now
apply exercise 14 in Section 6.1 to conclude that $(2^r + 1) \mid (2^m + 1)$.)
19. Suppose x is a positive rational number.
(a) Prove that there are positive integers a and b such that x = a/b and
gcd(a, b) = 1.
(b) Suppose a, b, c, and d are positive integers, x = a/b = c/d, and gcd(a, b)
= gcd(c, d) = 1. Prove that a = c and b = d.
(c) Prove that there are prime numbers $p_1, p_2, \ldots, p_k$ and nonzero integers
$e_1, e_2, \ldots, e_k$ such that $p_1 < p_2 < \cdots < p_k$ and

\[ x = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}. \]

Note that some of the exponents $e_i$ may be negative.
(d) Prove that the representation of x in part (c) is unique. In other words, if
$p_1, p_2, \ldots, p_k$ and $q_1, q_2, \ldots, q_m$ are prime numbers, $e_1, e_2, \ldots, e_k$
and $f_1, f_2, \ldots, f_m$ are nonzero integers, $p_1 < p_2 < \cdots < p_k$, $q_1 < q_2 < \cdots < q_m$, and

\[ x = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k} = q_1^{f_1} q_2^{f_2} \cdots q_m^{f_m}, \]

then k = m and for all $i \in \{1, 2, \ldots, k\}$, $p_i = q_i$ and $e_i = f_i$.
20. Complete the following proof that $\sqrt{2}$ is irrational: Suppose $a/b = \sqrt{2}$,
where a and b are positive integers. Then $a^2 = 2b^2$. Now derive a
contradiction by considering the exponent of 2 in the prime
factorizations of a and b.


7.3. Modular Arithmetic 


Suppose m is a positive integer. Recall from Definition 4.5.9 that for any
integers a and b, we say that a is congruent to b modulo m if m | (a − b). We
write $a \equiv b \pmod{m}$, or more briefly $a \equiv_m b$, to indicate that a is congruent
to b modulo m. We saw in Theorem 4.5.10 that $\equiv_m$ is an equivalence
relation on $\mathbb{Z}$. For any integer a, let $[a]_m$ be the equivalence class of a with
respect to the equivalence relation $\equiv_m$. The set of all of these equivalence
classes is denoted $\mathbb{Z}/{\equiv_m}$. Thus,

\[ [a]_m = \{b \in \mathbb{Z} \mid b \equiv a \pmod{m}\}, \qquad \mathbb{Z}/{\equiv_m} = \{[a]_m \mid a \in \mathbb{Z}\}. \]

As we know from Theorem 4.5.4, $\mathbb{Z}/{\equiv_m}$ is a partition of $\mathbb{Z}$.


For example, in the case m = 3 we have 


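\[ [0]_3 = \{b \in \mathbb{Z} \mid b \equiv 0 \pmod{3}\} = \{0, 3, 6, 9, \ldots, -3, -6, -9, \ldots\}, \]
\[ [1]_3 = \{b \in \mathbb{Z} \mid b \equiv 1 \pmod{3}\} = \{1, 4, 7, 10, \ldots, -2, -5, -8, \ldots\}, \]
\[ [2]_3 = \{b \in \mathbb{Z} \mid b \equiv 2 \pmod{3}\} = \{2, 5, 8, 11, \ldots, -1, -4, -7, \ldots\}. \]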


Notice that every integer is an element of exactly one of these equivalence 
classes. It follows that every integer is congruent modulo 3 to exactly one 
of the numbers 0, 1, and 2. This is an instance of the following general 
theorem. 


Theorem 7.3.1. Suppose m is a positive integer. Then for every integer a,
there is exactly one integer r such that 0 ≤ r < m and $a \equiv r \pmod{m}$.


Proof. Let a be an arbitrary integer. Let q and r be the quotient and
remainder when a is divided by m (see exercise 14 in Section 6.4). This
means that a = qm + r and 0 ≤ r < m. Then a − r = qm, so m | (a − r), and
therefore $a \equiv r \pmod{m}$. This proves the existence of the required integer r.

To prove uniqueness, suppose $r_1$ and $r_2$ are integers such that $0 \le r_1 < m$,
$0 \le r_2 < m$, $a \equiv r_1 \pmod{m}$, and $a \equiv r_2 \pmod{m}$. Then by the symmetry and
transitivity of the equivalence relation $\equiv_m$, $r_1 \equiv r_2 \pmod{m}$, so there is some
integer d such that $r_1 - r_2 = dm$. But from $0 \le r_1 < m$ and $0 \le r_2 < m$ we see
that $-m < r_1 - r_2 < m$. Thus −m < dm < m, which implies that −1 < d < 1.
The only integer strictly between −1 and 1 is 0, so d = 0 and therefore
$r_1 - r_2 = dm = 0$. In other words, $r_1 = r_2$. □


Commentary. Of course, the existence and uniqueness of the number r are
proven separately, and the proof of uniqueness uses the usual strategy of
assuming that $r_1$ and $r_2$ are two integers with the required properties and
then proving $r_1 = r_2$.


Theorem 7.3.1 says that every integer is congruent modulo m to exactly
one element of the set {0, 1, ..., m − 1}. We say that this set is a complete
residue system modulo m.
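As a small illustration of Theorem 7.3.1, the following Python sketch computes the unique r in {0, 1, ..., m − 1} congruent to a given integer. It relies on the fact that Python's `divmod` uses floor division, so for m > 0 the remainder is never negative, even when a is negative; this matches the quotient and remainder used in the proof. The function name `residue` is our own.

```python
def residue(a: int, m: int) -> int:
    """Return the unique r with 0 <= r < m and a ≡ r (mod m), for m > 0."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    q, r = divmod(a, m)  # floor division: a == q*m + r with 0 <= r < m
    assert a == q * m + r and 0 <= r < m
    return r


# Every integer lands in exactly one of the classes [0]_3, [1]_3, [2]_3.
for a in (-8, -7, -3, 0, 10, 11):
    print(a, residue(a, 3))
```

In some other languages the remainder operator can return a negative value for negative a, so this behavior should not be assumed outside Python.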


Note that by Lemma 4.5.5,

\[ a \equiv r \pmod{m} \quad\text{iff}\quad a \in [r]_m \quad\text{iff}\quad [a]_m = [r]_m. \]

Thus, Theorem 7.3.1 shows that every equivalence class in $\mathbb{Z}/{\equiv_m}$ is equal to
exactly one of the equivalence classes in the list $[0]_m, [1]_m, \ldots, [m-1]_m$.
Thus, these m equivalence classes are distinct, and
$\mathbb{Z}/{\equiv_m} = \{[0]_m, [1]_m, \ldots, [m-1]_m\}$.

Consider any two equivalence classes X and Y in $\mathbb{Z}/{\equiv_m}$. Something
surprising happens if we add or multiply elements of X and Y. It turns out
that all sums of the form x + y, where x ∈ X and y ∈ Y, belong to the same
equivalence class, and also all products xy belong to the same equivalence
class. In other words, we have the following theorem.


Theorem 7.3.2. Suppose m is a positive integer and X and Y are elements of
$\mathbb{Z}/{\equiv_m}$. Then:


1. There is a unique $S \in \mathbb{Z}/{\equiv_m}$ such that ∀x ∈ X ∀y ∈ Y (x + y ∈ S).
2. There is a unique $P \in \mathbb{Z}/{\equiv_m}$ such that ∀x ∈ X ∀y ∈ Y (xy ∈ P).


We will prove this theorem shortly, but first we use it to introduce two
binary operations on $\mathbb{Z}/{\equiv_m}$.


Definition 7.3.3. Suppose X and Y are elements of $\mathbb{Z}/{\equiv_m}$. Then we define
the sum and product of X and Y, denoted X + Y and X · Y, as follows:


X + Y = the unique $S \in \mathbb{Z}/{\equiv_m}$ such that ∀x ∈ X ∀y ∈ Y (x + y ∈ S),
X · Y = the unique $P \in \mathbb{Z}/{\equiv_m}$ such that ∀x ∈ X ∀y ∈ Y (xy ∈ P).


The key to our proof of Theorem 7.3.2 will be the following lemma. 


Lemma 7.3.4. Suppose m is a positive integer. Then for all integers a, a', b,
and b', if $a' \equiv a \pmod{m}$ and $b' \equiv b \pmod{m}$ then $a' + b' \equiv a + b \pmod{m}$
and $a'b' \equiv ab \pmod{m}$.


Proof. Suppose $a' \equiv a \pmod{m}$ and $b' \equiv b \pmod{m}$. Then m | (a' − a) and
m | (b' − b), so we can choose integers c and d such that a' − a = cm and
b' − b = dm, or in other words a' = a + cm and b' = b + dm. Therefore
(a' + b') − (a + b) = (a + cm + b + dm) − (a + b) = cm + dm = (c + d)m, so
m | ((a' + b') − (a + b)), which means $a' + b' \equiv a + b \pmod{m}$. Similarly,
$a'b' - ab = (a + cm)(b + dm) - ab = adm + bcm + cdm^2 = (ad + bc + cdm)m$, so
m | (a'b' − ab), and therefore $a'b' \equiv ab \pmod{m}$. □
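A brute-force check can make Lemma 7.3.4 concrete, although of course it is no substitute for the proof. The sketch below, with names of our own choosing, replaces a and b by other members a + cm and b + dm of their equivalence classes and confirms that the sum and product stay in the same classes.

```python
def lemma_7_3_4_holds(m: int, shifts: range = range(-3, 4)) -> bool:
    """Check the lemma for all representatives a + c*m, b + d*m with c, d in shifts."""
    for a in range(m):
        for b in range(m):
            for c in shifts:
                for d in shifts:
                    a2, b2 = a + c * m, b + d * m  # a2 ≡ a and b2 ≡ b (mod m)
                    if ((a2 + b2) - (a + b)) % m != 0:
                        return False
                    if (a2 * b2 - a * b) % m != 0:
                        return False
    return True


print(lemma_7_3_4_holds(5), lemma_7_3_4_holds(12))
```

The check only samples finitely many representatives; the algebra in the proof is what shows that the differences are always multiples of m.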


Proof of Theorem 7.3.2. Since X and Y are elements of $\mathbb{Z}/{\equiv_m}$, we can let a
and b be integers such that $X = [a]_m$ and $Y = [b]_m$. To prove part 1 of the
theorem, let $S = [a + b]_m$. Now let x ∈ X and y ∈ Y be arbitrary. Then
$x \in [a]_m$ and $y \in [b]_m$, so $x \equiv a \pmod{m}$ and $y \equiv b \pmod{m}$. By Lemma 7.3.4 it
follows that $x + y \equiv a + b \pmod{m}$, so $x + y \in [a + b]_m = S$. Since x and y
were arbitrary, we conclude that ∀x ∈ X ∀y ∈ Y (x + y ∈ S).

To prove that S is unique, suppose S' is another equivalence class such
that ∀x ∈ X ∀y ∈ Y (x + y ∈ S'). Since a ∈ X and b ∈ Y, a + b ∈ S and a + b
∈ S'. Therefore S and S' are not disjoint, and since $\mathbb{Z}/{\equiv_m}$ is pairwise
disjoint, this implies that S = S'.

The proof of part 2 is similar, using $P = [ab]_m$; see exercise 2. □


The proof of Theorem 7.3.2 shows that if $X = [a]_m$ and $Y = [b]_m$, then the
sum of X and Y is the equivalence class $S = [a + b]_m$ and the product is
$P = [ab]_m$. Thus, we have the following theorem.


Theorem 7.3.5. For any positive integer m and any integers a and b,

\[ [a]_m + [b]_m = [a + b]_m \quad\text{and}\quad [a]_m \cdot [b]_m = [ab]_m. \]


Let's try out these ideas. Consider the case m = 5. We know that every
element of $\mathbb{Z}/{\equiv_5}$ is equal to either $[0]_5$, $[1]_5$, $[2]_5$, $[3]_5$, or $[4]_5$, and we will
often choose to write equivalence classes in one of these forms. For
example, $[2]_5 + [4]_5 = [6]_5$, but also $6 \equiv 1 \pmod{5}$, so $[6]_5 = [1]_5$. Thus, we
can say that $[2]_5 + [4]_5 = [1]_5$. Similarly, $[2]_5 \cdot [4]_5 = [8]_5 = [3]_5$. Figure 7.4
shows the complete addition and multiplication tables for $\mathbb{Z}/{\equiv_5}$.


  +    | [0]_5  [1]_5  [2]_5  [3]_5  [4]_5
[0]_5  | [0]_5  [1]_5  [2]_5  [3]_5  [4]_5
[1]_5  | [1]_5  [2]_5  [3]_5  [4]_5  [0]_5
[2]_5  | [2]_5  [3]_5  [4]_5  [0]_5  [1]_5
[3]_5  | [3]_5  [4]_5  [0]_5  [1]_5  [2]_5
[4]_5  | [4]_5  [0]_5  [1]_5  [2]_5  [3]_5

  ·    | [0]_5  [1]_5  [2]_5  [3]_5  [4]_5
[0]_5  | [0]_5  [0]_5  [0]_5  [0]_5  [0]_5
[1]_5  | [0]_5  [1]_5  [2]_5  [3]_5  [4]_5
[2]_5  | [0]_5  [2]_5  [4]_5  [1]_5  [3]_5
[3]_5  | [0]_5  [3]_5  [1]_5  [4]_5  [2]_5
[4]_5  | [0]_5  [4]_5  [3]_5  [2]_5  [1]_5


Figure 7.4. Addition and multiplication tables for $\mathbb{Z}/{\equiv_5}$.
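Tables like those in Figure 7.4 can be generated mechanically. By Theorem 7.3.5 it is enough to add or multiply the representatives 0, 1, ..., m − 1 and then reduce to the representative in the same range. The following Python sketch does this; the function name `operation_tables` and the choice to represent the class $[a]_m$ by the integer a are our own conventions.

```python
def operation_tables(m: int):
    """Return (addition, multiplication) tables for the classes [0]_m..[m-1]_m.

    Entry [a][b] is the representative in 0..m-1 of [a]_m + [b]_m
    (respectively [a]_m * [b]_m).
    """
    add = [[(a + b) % m for b in range(m)] for a in range(m)]
    mul = [[(a * b) % m for b in range(m)] for a in range(m)]
    return add, mul


add5, mul5 = operation_tables(5)
print("addition mod 5:")
for row in add5:
    print(row)
print("multiplication mod 5:")
for row in mul5:
    print(row)
```

Calling `operation_tables(6)` in the same way gives the tables asked for in exercise 1 of this section.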


How do addition and multiplication in $\mathbb{Z}/{\equiv_m}$ compare to addition and
multiplication in $\mathbb{Z}$? Many properties of addition and multiplication in $\mathbb{Z}$
carry over easily to $\mathbb{Z}/{\equiv_m}$.


Theorem 7.3.6. Suppose m is a positive integer. Then for all equivalence
classes X, Y, and Z in $\mathbb{Z}/{\equiv_m}$:


1. X + Y = Y + X. (Addition is commutative.)
2. (X + Y) + Z = X + (Y + Z). (Addition is associative.)
3. $X + [0]_m = X$. ($[0]_m$ is an identity element for addition.)
4. There is some $X' \in \mathbb{Z}/{\equiv_m}$ such that $X + X' = [0]_m$. (X has an additive
inverse.)
5. X · Y = Y · X. (Multiplication is commutative.)
6. (X · Y) · Z = X · (Y · Z). (Multiplication is associative.)
7. $X \cdot [1]_m = X$. ($[1]_m$ is an identity element for multiplication.)
8. $X \cdot [0]_m = [0]_m$.
9. X · (Y + Z) = (X · Y) + (X · Z). (Multiplication distributes over addition.)


Proof. Since $X, Y, Z \in \mathbb{Z}/{\equiv_m}$, there are integers a, b, and c such that
$X = [a]_m$, $Y = [b]_m$, and $Z = [c]_m$. For part 1, we use the commutativity of
addition in $\mathbb{Z}$:

\[ X + Y = [a]_m + [b]_m = [a + b]_m = [b + a]_m = [b]_m + [a]_m = Y + X. \]

The proof of 2 is similar. To prove part 3, we compute

\[ X + [0]_m = [a]_m + [0]_m = [a + 0]_m = [a]_m = X. \]

For part 4, let $X' = [-a]_m$. Then

\[ X + X' = [a]_m + [-a]_m = [a + (-a)]_m = [0]_m. \]

The proofs of the remaining parts are similar (see exercise 3). □


You are asked to show in exercise 4 that the identity elements and
inverses in Theorem 7.3.6 are unique. Thus, in part 3 of the theorem we can
say that $[0]_m$ is not just an identity element for addition, but the identity
element, and similarly $[1]_m$ is the identity element for multiplication. In part
4, we can say that X' is the additive inverse of X; we will denote the
additive inverse of X by −X. For example, according to the addition table
for $\mathbb{Z}/{\equiv_5}$ in Figure 7.4, $[4]_5 + [1]_5 = [0]_5$, so $-[4]_5 = [1]_5$.

What about multiplicative inverses? If $X \in \mathbb{Z}/{\equiv_m}$, $X' \in \mathbb{Z}/{\equiv_m}$, and
$X \cdot X' = [1]_m$, then we say that X' is a multiplicative inverse of X. For example,
according to the multiplication table for $\mathbb{Z}/{\equiv_5}$ in Figure 7.4, $[3]_5 \cdot [2]_5 =
[1]_5$, so $[2]_5$ is a multiplicative inverse of $[3]_5$. In fact, in $\mathbb{Z}/{\equiv_5}$, every
element except $[0]_5$ has a multiplicative inverse. Multiplicative inverses,
when they exist, are also unique (see exercise 4), so we can say that $[2]_5$ is
the multiplicative inverse of $[3]_5$. In general, if $X \in \mathbb{Z}/{\equiv_m}$ then the
multiplicative inverse of X, if it exists, is denoted $X^{-1}$. Thus $[3]_5^{-1} = [2]_5$.

A little experimentation reveals that multiplicative inverses often don't
exist. For example, we leave it for you to check that in $\mathbb{Z}/{\equiv_6}$, only $[1]_6$ and
$[5]_6$ have multiplicative inverses (see exercise 1). When does an
equivalence class have a multiplicative inverse? The answer is given by our
next theorem.


Theorem 7.3.7. Suppose that a and m are positive integers. Then $[a]_m$ has
a multiplicative inverse iff m and a are relatively prime.


Proof. Suppose first that $[a]_m$ has a multiplicative inverse; say
$[a]_m^{-1} = [a']_m$. Then $[a]_m \cdot [a']_m = [aa']_m = [1]_m$ and therefore $aa' \equiv 1
\pmod{m}$. This means that m | (aa' − 1), so we can choose some integer c
such that aa' − 1 = cm, or equivalently −cm + a'a = 1. Thus 1 is a linear
combination of m and a, and by exercise 6 in the last section it follows that
m and a are relatively prime.

For the other direction, assume that m and a are relatively prime. Then by
Theorem 7.1.4 there are integers s and t such that sm + ta = 1.
Therefore ta − 1 = −sm, so $ta \equiv 1 \pmod{m}$. We conclude that $[a]_m \cdot [t]_m =
[ta]_m = [1]_m$, so $[t]_m$ is the multiplicative inverse of $[a]_m$. □


Commentary. Notice that the conclusion of the theorem is a biconditional 
statement, and the proof uses the usual strategy of proving both directions 
of the biconditional separately. 


The proof of Theorem 7.3.7 shows that for any positive integers m and a,
we can use the extended Euclidean algorithm to find $[a]_m^{-1}$. If the algorithm
shows that gcd(m, a) ≠ 1 then $[a]_m^{-1}$ doesn't exist, and if we find that
gcd(m, a) = 1 = sm + ta then $[a]_m^{-1} = [t]_m$.


Example 7.3.8. Find, if possible, the multiplicative inverses of $[34]_{847}$ and
$[35]_{847}$ in $\mathbb{Z}/{\equiv_{847}}$.


Solution 


Figure 7.5 shows the calculation of gcd(847, 34) by the extended Euclidean
algorithm. We conclude that gcd(847, 34) = 1 = 11 · 847 − 274 · 34, and
therefore $[34]_{847}^{-1} = [-274]_{847} = [573]_{847}$. As you can easily check,
$34 \cdot 573 = 19482 \equiv 1 \pmod{847}$, so $[34]_{847} \cdot [573]_{847} = [19482]_{847} = [1]_{847}$.


n    q_n    r_n    s_n    t_n     Division
0           847    1      0
1           34     0      1       847 = 24 · 34 + 31
2    24     31     1      −24     34 = 1 · 31 + 3
3    1      3      −1     25      31 = 10 · 3 + 1
4    10     1      11     −274    3 = 3 · 1 + 0
5    3      0


Figure 7.5. Calculation of gcd(847, 34) by extended Euclidean algorithm.


We leave it to you to compute that gcd(847, 35) = 7. Therefore $[35]_{847}$
does not have a multiplicative inverse.
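The method of this example can be sketched in Python as follows. The function names `extended_gcd` and `mod_inverse` are our own, and the loop is one common way of organizing the extended Euclidean algorithm; it tracks the same remainders and coefficients as the columns $r_n$, $s_n$, $t_n$ of Figure 7.5, though we do not claim it matches the book's presentation step for step.

```python
def extended_gcd(m: int, a: int):
    """Return (g, s, t) with g = gcd(m, a) and g == s*m + t*a."""
    r0, r1 = m, a
    s0, s1 = 1, 0
    t0, t1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        # Each new remainder is a linear combination of m and a; update the
        # coefficients in step with the remainders.
        r0, r1 = r1, r0 - q * r1
        s0, s1 = s1, s0 - q * s1
        t0, t1 = t1, t0 - q * t1
    return r0, s0, t0


def mod_inverse(a: int, m: int):
    """Return the representative in 0..m-1 of the inverse of [a]_m, or None."""
    g, s, t = extended_gcd(m, a)
    if g != 1:
        return None  # by Theorem 7.3.7, no inverse unless gcd(m, a) = 1
    return t % m


print(extended_gcd(847, 34))  # coefficients s and t with s*847 + t*34 == 1
print(mod_inverse(34, 847), mod_inverse(35, 847))
```

In Python 3.8 and later, the built-in call `pow(a, -1, m)` computes the same inverse and raises `ValueError` when none exists.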


Example 7.3.9. A class has 25 students. For Easter, the teacher bought 
several cartons of eggs, each containing a dozen eggs, and then distributed 
the eggs among the students for them to decorate. After giving an equal 
number of eggs to each student, she had 7 eggs left over. What is the 
smallest number of cartons of eggs she could have bought? 


Solution 


Let x be the number of cartons of eggs the teacher bought. Then she had 
12x eggs, and setting aside the 7 left over at the end, the remaining eggs 
were divided evenly among 25 students. Therefore 25 | (12x − 7), so
$12x \equiv 7 \pmod{25}$. We must find the smallest positive integer x satisfying this
congruence. 

If we were solving the equation 12x = 7 for a real number x, we would 
know what to do. If 12x = 7, then by multiplying both sides of the equation 
by 1/12 we conclude that x = 7/12. In fact, this reasoning can be reversed: if 
x = 7/12, then multiplying by 12 we get 12x = 7. Thus, the equations 12x = 
7 and x = 7/12 are equivalent, which means that x = 7/12 is the unique 
solution to the equation 12x = 7. 

Unfortunately, we are working with the congruence $12x \equiv 7 \pmod{25}$,
which is not an equation. But we can turn it into an equation by working
with equivalence classes. Our congruence is equivalent to the equation
$[12]_{25} \cdot [x]_{25} = [7]_{25}$, and we can solve this equation by imitating our
solution to the equation 12x = 7. We begin by finding the multiplicative
inverse of $[12]_{25}$. Applying the extended Euclidean algorithm, we find that
gcd(25, 12) = 1 = 1 · 25 − 2 · 12, so $[12]_{25}^{-1} = [-2]_{25} = [23]_{25}$.

To solve the equation $[12]_{25} \cdot [x]_{25} = [7]_{25}$, we multiply both sides by
$[12]_{25}^{-1} = [23]_{25}$. We spell out all the steps in detail, to make it clear how the
properties in Theorem 7.3.6 are being used:

\begin{align*}
[12]_{25} \cdot [x]_{25} &= [7]_{25}, \\
[12]_{25}^{-1} \cdot ([12]_{25} \cdot [x]_{25}) &= [12]_{25}^{-1} \cdot [7]_{25}, \\
([12]_{25}^{-1} \cdot [12]_{25}) \cdot [x]_{25} &= [23]_{25} \cdot [7]_{25}, \\
[1]_{25} \cdot [x]_{25} &= [161]_{25} = [11]_{25}, \\
[x]_{25} &= [11]_{25}.
\end{align*}

As before, these steps can be reversed: multiplying both sides of the
equation $[x]_{25} = [11]_{25}$ by $[12]_{25}$ gives us $[12]_{25} \cdot [x]_{25} = [7]_{25}$. Therefore

\[ 12x \equiv 7 \pmod{25} \ \text{ iff }\ [12]_{25} \cdot [x]_{25} = [7]_{25} \ \text{ iff }\ [x]_{25} = [11]_{25} \ \text{ iff }\ x \in [11]_{25}. \]

In other words, the solutions to the congruence $12x \equiv 7 \pmod{25}$ are
precisely the elements of the equivalence class $[11]_{25}$, and the smallest
positive solution is x = 11. If the teacher bought 11 cartons of eggs, then she
had 132 eggs, and after giving 5 to each student she had 7 left over.
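The solution of this example amounts to multiplying the right-hand side by a multiplicative inverse. Here is a minimal Python sketch of that one step, assuming (as in the example) that the coefficient and the modulus are relatively prime. The function name is ours, and the sketch uses the built-in `pow(a, -1, m)`, available in Python 3.8 and later, rather than spelling out the extended Euclidean algorithm.

```python
def solve_coprime_congruence(a: int, b: int, m: int) -> int:
    """Return the x in 0..m-1 with a*x ≡ b (mod m), assuming gcd(a, m) = 1."""
    inverse = pow(a, -1, m)  # raises ValueError if a has no inverse mod m
    return (inverse * b) % m


x = solve_coprime_congruence(12, 7, 25)
eggs = 12 * x
print(x, eggs, divmod(eggs, 25))  # cartons, eggs, (eggs per student, left over)
```

All other solutions differ from the returned value by a multiple of m, since the solution set is a single equivalence class.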


We were lucky in this example that 25 and 12 were relatively prime, so
that $[12]_{25}$ had a multiplicative inverse. This multiplicative inverse played a
crucial role in our solution of the congruence $12x \equiv 7 \pmod{25}$. How can
we solve a congruence $ax \equiv b \pmod{m}$ if m and a are not relatively prime?
We won't analyze such congruences in detail, but we'll give a couple of
examples illustrating how such congruences can be solved by using the
following two theorems.


Theorem 7.3.10. Suppose m and a are positive integers, and let d = gcd(m,
a). Then for every integer b, if d ∤ b then there is no integer x such that
$ax \equiv b \pmod{m}$.


Proof. See exercise 7. □


Theorem 7.3.11. Suppose n and m are positive integers. Then for all
integers a and b,

\[ na \equiv nb \pmod{nm} \quad\text{iff}\quad a \equiv b \pmod{m}. \]


Proof. See exercise 8. □


Example 7.3.12. Solve the following congruences:

\[ 77x \equiv 120 \pmod{374}, \qquad 77x \equiv 121 \pmod{374}. \]


Solution


We begin by computing that gcd(374, 77) = 11. Since 11 ∤ 120, Theorem
7.3.10 tells us that the first congruence, $77x \equiv 120 \pmod{374}$, has no
solutions. To solve the second congruence, we first write it as
$11 \cdot 7x \equiv 11 \cdot 11 \pmod{11 \cdot 34}$ and then observe that by Theorem 7.3.11, this is
equivalent to $7x \equiv 11 \pmod{34}$. To solve this congruence, we compute that
gcd(34, 7) = 1 = −1 · 34 + 5 · 7, so $[7]_{34}^{-1} = [5]_{34}$. Therefore

\begin{align*}
7x \equiv 11 \pmod{34} \ &\text{ iff }\ [7]_{34} \cdot [x]_{34} = [11]_{34} \\
&\text{ iff }\ [x]_{34} = [7]_{34}^{-1} \cdot [11]_{34} = [5]_{34} \cdot [11]_{34} = [55]_{34} = [21]_{34} \\
&\text{ iff }\ x \in [21]_{34}.
\end{align*}

Thus the solutions to the second congruence are the elements of $[21]_{34}$.
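The two-step approach of this example (first test divisibility by d = gcd(m, a), then divide the congruence through by d and invert) can be collected into one short routine. This is a sketch under our own naming and return conventions, not a procedure stated in the text: it returns `None` when Theorem 7.3.10 rules out solutions, and otherwise a pair `(x0, modulus)` meaning that the solutions are exactly the integers congruent to `x0` modulo `modulus`. It assumes a and m are positive and a Python version of 3.8 or later for `pow(a, -1, m)`.

```python
from math import gcd


def solve_linear_congruence(a: int, b: int, m: int):
    """Solve a*x ≡ b (mod m) for positive a and m.

    Returns None if there is no solution; otherwise (x0, modulus) such that
    the solutions are precisely the elements of the class of x0 mod modulus.
    """
    d = gcd(m, a)
    if b % d != 0:
        return None  # Theorem 7.3.10: d does not divide b, so no solutions
    # Theorem 7.3.11: dividing a, b, and m by d gives an equivalent congruence
    # in which the coefficient and modulus are relatively prime.
    a, b, m = a // d, b // d, m // d
    x0 = (pow(a, -1, m) * b) % m
    return x0, m


print(solve_linear_congruence(77, 120, 374))
print(solve_linear_congruence(77, 121, 374))
```

Note that the modulus in the answer is m/d, not m, which is why the solution set in the example is a class modulo 34 rather than modulo 374.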


Exercises

1. Make addition and multiplication tables for $\mathbb{Z}/{\equiv_6}$.
2. Complete the proof of Theorem 7.3.2.
3. Prove parts 5-9 of Theorem 7.3.6.
4. Suppose m is a positive integer.
(a) Suppose $Z_1$ and $Z_2$ are both additive identity elements for $\mathbb{Z}/{\equiv_m}$; in
other words, for all $X \in \mathbb{Z}/{\equiv_m}$, $X + Z_1 = X$ and $X + Z_2 = X$. Prove that
$Z_1 = Z_2$. This shows that the additive identity element in $\mathbb{Z}/{\equiv_m}$ is
unique. (Hint: Compute $Z_1 + Z_2$ in two different ways.)
(b) Suppose $X \in \mathbb{Z}/{\equiv_m}$ and $X'_1$ and $X'_2$ are both additive inverses for X; in
other words, $X + X'_1 = X + X'_2 = [0]_m$. Prove that $X'_1 = X'_2$. This shows
that the additive inverse of X is unique. (Hint: Compute $X'_1 + X + X'_2$
in two different ways.)
(c) Prove that the multiplicative identity element in $\mathbb{Z}/{\equiv_m}$ is unique.
(d) Prove that if an equivalence class $X \in \mathbb{Z}/{\equiv_m}$ has a multiplicative
inverse, then this inverse is unique.
5. Show that if p is a prime number then every element of $\mathbb{Z}/{\equiv_p}$ except
$[0]_p$ has a multiplicative inverse.
6. If $ab \equiv 0 \pmod{m}$, is it necessarily true that either $a \equiv 0 \pmod{m}$ or
$b \equiv 0 \pmod{m}$? Justify your answer with either a proof or a
counterexample.
7. Prove Theorem 7.3.10.
8. Prove Theorem 7.3.11.


9. A class has 26 students. The teacher bought some packages of file 
cards, each of which contained 20 file cards. When he passed the cards 
out to the students, he discovered that he needed to add 2 additional 
cards from his desk to be able to give each student the same number of 
cards. If each student got between 10 and 20 cards, how many packages 
did he buy? 

*10. Solve the following congruences.

(a) $40x \equiv 8 \pmod{237}$.

(b) $40x \equiv 8 \pmod{236}$.

11. Solve the following congruences.

(a) $31x \equiv 24 \pmod{384}$.

(b) $32x \equiv 24 \pmod{384}$.

12. In this exercise you will solve the following problem: Suppose a chair
without arms costs $35 and a chair with arms costs $50. If Alice spent
$720 on chairs, how many of each kind of chair did she buy?

(a) Show that if $x$ is the number of chairs without arms that she bought,
then $35x \equiv 20 \pmod{50}$.

(b) Solve the congruence in part (a).

(c) Not every solution to the congruence in part (a) leads to a possible
answer to the problem. Which ones do? (Note: There is more than one
possible answer to the problem.)

13. Suppose $m$ and $n$ are relatively prime positive integers. Prove that for
all integers $a$ and $b$, $a \equiv b \pmod{m}$ iff $na \equiv nb \pmod{m}$.

14. Suppose that $m_1$ and $m_2$ are positive integers. Prove that for all
integers $a$ and $b$, if $a \equiv b \pmod{m_1}$ and $a \equiv b \pmod{m_2}$ then
$a \equiv b \pmod{\operatorname{lcm}(m_1, m_2)}$. (Hint: Use exercise 11 in
Section 7.2.)

15. Prove that for all positive integers $m$, $a$, and $b$, if
$a \equiv b \pmod{m}$ then $\gcd(m, a) = \gcd(m, b)$.

16. Suppose $a \equiv b \pmod{m}$. Prove that for every natural number $n$,
$a^n \equiv b^n \pmod{m}$.

In exercises 17–19, we use the following notation. If
$d_0, d_1, \ldots, d_k \in \{0, 1, \ldots, 9\}$, then $(d_k \cdots d_1 d_0)_{10}$
is the number whose representation in decimal notation is
$d_k \cdots d_1 d_0$. In other words,

$$(d_k \cdots d_1 d_0)_{10} = d_0 + 10 d_1 + \cdots + 10^k d_k.$$

17. Suppose $n = (d_k \cdots d_1 d_0)_{10}$.

(a) Show that $n \equiv (d_0 + d_1 + \cdots + d_k) \pmod{3}$.

(b) Show that $3 \mid n$ iff $3 \mid (d_0 + d_1 + \cdots + d_k)$. (This gives a
convenient way to test a natural number for divisibility by 3: add up the
digits and check if the digit sum is divisible by 3.)

18. Suppose $n = (d_k \cdots d_1 d_0)_{10}$.

(a) Show that $n \equiv (d_0 - d_1 + d_2 - d_3 + \cdots + (-1)^k d_k) \pmod{11}$.

(b) Show that $11 \mid n$ iff $11 \mid (d_0 - d_1 + \cdots + (-1)^k d_k)$.

(c) Is 535172 divisible by 11?

19. Define a function $f$ with domain $\{n \in \mathbb{Z} \mid n \ge 10\}$ as
follows: if $n = (d_k \cdots d_1 d_0)_{10}$ then
$f(n) = (d_k \cdots d_1)_{10} + 5 d_0$. For example,
$f(1743) = 174 + 5 \cdot 3 = 189$.

(a) Show that for all $n \ge 10$, $f(n) \equiv 5n \pmod{7}$ and
$n \equiv 3 f(n) \pmod{7}$.

(b) Show that for all $n \ge 10$, $7 \mid n$ iff $7 \mid f(n)$. (This gives a
convenient way to test a large integer $n$ for divisibility by 7: repeatedly
apply $f$ until you get a number whose divisibility by 7 is easy to
determine.)

(c) Is 627334 divisible by 7?

20. (a) Find an example of positive integers $m$, $a$, $a'$, $b$, and $b'$ such
that $a' \equiv a \pmod{m}$ and $b' \equiv b \pmod{m}$ but
$(a')^{b'} \not\equiv a^b \pmod{m}$.

(b) Show that it is impossible to define an exponentiation operation on
equivalence classes in such a way that for all positive integers $m$, $a$,
and $b$, $([a]_m)^{[b]_m} = [a^b]_m$.

21. Suppose $m$ is a positive integer. Define
$f : \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}/{\equiv_m}$ by the formula
$f(a, b) = [a + b]_m$, and define
$h : (\mathbb{Z}/{\equiv_m}) \times (\mathbb{Z}/{\equiv_m}) \to \mathbb{Z}/{\equiv_m}$
by the formula $h(X, Y) = X + Y$. You might want to compare this exercise to
exercise 21 in Section 5.1.

(a) Show that for all integers $x_1$, $x_2$, $y_1$, and $y_2$, if
$x_1 \equiv_m y_1$ and $x_2 \equiv_m y_2$ then $f(x_1, x_2) = f(y_1, y_2)$.
(Extending the terminology of exercise 21 in Section 5.1, we could say that
$f$ is compatible with $\equiv_m$.)

(b) Show that for all integers $x_1$ and $x_2$,
$h([x_1]_m, [x_2]_m) = f(x_1, x_2)$.


7.4. Euler’s Theorem 


In the last section, we saw that some elements of $\mathbb{Z}/{\equiv_m}$ have
multiplicative inverses and some don’t. In this section, we focus on the ones
that do. We let $(\mathbb{Z}/{\equiv_m})^*$ denote the set of elements of
$\mathbb{Z}/{\equiv_m}$ that have multiplicative inverses. In other words,

$$(\mathbb{Z}/{\equiv_m})^* = \{X \in \mathbb{Z}/{\equiv_m} \mid \text{for some } X' \in \mathbb{Z}/{\equiv_m},\ X \cdot X' = [1]_m\}.$$

The number of elements of $(\mathbb{Z}/{\equiv_m})^*$ is denoted $\varphi(m)$.
The function $\varphi$ is called Euler’s phi function, or Euler’s totient
function; it was introduced by Euler in 1763. For every positive integer $m$,
$(\mathbb{Z}/{\equiv_m})^* \subseteq \mathbb{Z}/{\equiv_m}$ and
$\mathbb{Z}/{\equiv_m}$ has $m$ elements, so $\varphi(m) \le m$. And
$[1]_m \cdot [1]_m = [1]_m$, so $[1]_m \in (\mathbb{Z}/{\equiv_m})^*$ and
therefore $\varphi(m) \ge 1$. For example,

$$(\mathbb{Z}/{\equiv_{10}})^* = \{[1]_{10}, [3]_{10}, [7]_{10}, [9]_{10}\},$$

so $\varphi(10) = 4$.
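For small $m$, the definition can be checked directly by machine. The following Python sketch (the names `units` and `phi_by_definition` are invented for this illustration) represents each equivalence class by its representative in $\{0, 1, \ldots, m-1\}$ and searches for an inverse by brute force, exactly as the definition suggests.

```python
def units(m):
    """Representatives a in {0, ..., m-1} whose class [a]_m has an inverse."""
    # 1 % m is used instead of 1 so that the case m == 1 also works,
    # where [0]_1 and [1]_1 are the same class.
    return [a for a in range(m)
            if any((a * b) % m == 1 % m for b in range(m))]


def phi_by_definition(m):
    """Euler's phi function, computed by counting invertible classes."""
    return len(units(m))


print(units(10))              # [1, 3, 7, 9]
print(phi_by_definition(10))  # 4
```

This search takes on the order of $m^2$ steps, so it is only sensible for small $m$.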
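For our purposes, the most important properties of $(\mathbb{Z}/{\equiv_m})^*$
are that it is closed under inverses and multiplication. That is:


Theorem 7.4.1. Suppose $m$ is a positive integer.

1. For every $X$ in $(\mathbb{Z}/{\equiv_m})^*$, $X^{-1} \in (\mathbb{Z}/{\equiv_m})^*$.

2. For every $X$ and $Y$ in $(\mathbb{Z}/{\equiv_m})^*$, $X \cdot Y \in (\mathbb{Z}/{\equiv_m})^*$.

Proof.

1. Suppose $X \in (\mathbb{Z}/{\equiv_m})^*$. Then $X$ has a multiplicative
inverse $X^{-1}$, and $X \cdot X^{-1} = [1]_m$. But this equation also tells us
that $X$ is the multiplicative inverse of $X^{-1}$; in other words,
$(X^{-1})^{-1} = X$. Therefore $X^{-1} \in (\mathbb{Z}/{\equiv_m})^*$.

2. Suppose $X \in (\mathbb{Z}/{\equiv_m})^*$ and $Y \in (\mathbb{Z}/{\equiv_m})^*$.
Then $X$ and $Y$ have multiplicative inverses $X^{-1}$ and $Y^{-1}$. Therefore

$$(X \cdot Y) \cdot (X^{-1} \cdot Y^{-1}) = (X \cdot X^{-1}) \cdot (Y \cdot Y^{-1}) = [1]_m \cdot [1]_m = [1]_m.$$

This means that $X^{-1} \cdot Y^{-1}$ is the multiplicative inverse of
$X \cdot Y$, so $(X \cdot Y)^{-1} = X^{-1} \cdot Y^{-1}$ and
$X \cdot Y \in (\mathbb{Z}/{\equiv_m})^*$. □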


Suppose $X \in (\mathbb{Z}/{\equiv_m})^*$. By Theorem 7.4.1, for every
$Y \in (\mathbb{Z}/{\equiv_m})^*$, $X \cdot Y \in (\mathbb{Z}/{\equiv_m})^*$, so we
can define a function $f_X : (\mathbb{Z}/{\equiv_m})^* \to (\mathbb{Z}/{\equiv_m})^*$
by the formula $f_X(Y) = X \cdot Y$. Let’s investigate the properties of this
function.

We claim first that $f_X$ is one-to-one. To see why, suppose
$Y_1 \in (\mathbb{Z}/{\equiv_m})^*$, $Y_2 \in (\mathbb{Z}/{\equiv_m})^*$, and
$f_X(Y_1) = f_X(Y_2)$. Then $X \cdot Y_1 = X \cdot Y_2$, and therefore

$$Y_1 = [1]_m \cdot Y_1 = X^{-1} \cdot X \cdot Y_1 = X^{-1} \cdot X \cdot Y_2 = [1]_m \cdot Y_2 = Y_2.$$

This proves that $f_X$ is one-to-one. Next, we claim that $f_X$ is onto. To
prove this, suppose $Y \in (\mathbb{Z}/{\equiv_m})^*$. Then since
$(\mathbb{Z}/{\equiv_m})^*$ is closed under inverses and multiplication,
$X^{-1} \cdot Y \in (\mathbb{Z}/{\equiv_m})^*$, and

$$f_X(X^{-1} \cdot Y) = X \cdot X^{-1} \cdot Y = [1]_m \cdot Y = Y.$$

Thus, $f_X$ is onto.

For example, consider again the case $m = 10$, and let $X = [3]_{10}$. Applying
$f_X$ to the four elements of $(\mathbb{Z}/{\equiv_{10}})^*$ gives the values
shown in Figure 7.6. Notice that, since $f_X$ is one-to-one and onto, each of
the four elements of $(\mathbb{Z}/{\equiv_{10}})^*$ appears exactly once in the
column under $f_X(Y)$; each element appears at least once because $f_X$ is
onto, and it appears only once because $f_X$ is one-to-one. Thus, the entries
in the second column of Figure 7.6 are exactly the same as the entries in the
first column, but listed in a different order.


$Y$           $f_X(Y)$

$[1]_{10}$    $[3]_{10} \cdot [1]_{10} = [3]_{10}$
$[3]_{10}$    $[3]_{10} \cdot [3]_{10} = [9]_{10}$
$[7]_{10}$    $[3]_{10} \cdot [7]_{10} = [1]_{10}$
$[9]_{10}$    $[3]_{10} \cdot [9]_{10} = [7]_{10}$


Figure 7.6. Values of $f_X$ when $X = [3]_{10}$.
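The table in Figure 7.6 is easy to reproduce by machine. In the Python sketch below, the function name `multiplication_table_row` is made up for this illustration, classes are represented by integers, and invertibility is tested with a gcd computation, which relies on Theorem 7.3.7 from the last section. The sketch assumes $m \ge 2$.

```python
from math import gcd


def multiplication_table_row(x, m):
    """Map each unit y of Z/≡_m (as a representative) to (x * y) % m."""
    unit_reps = [y for y in range(1, m + 1) if gcd(m, y) == 1]
    return {y: (x * y) % m for y in unit_reps}


row = multiplication_table_row(3, 10)
print(row)  # {1: 3, 3: 9, 7: 1, 9: 7}
# One-to-one and onto: the outputs are the inputs listed in a different order.
print(sorted(row.values()) == sorted(row.keys()))  # True
```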


More generally, suppose $m$ is a positive integer and
$X \in (\mathbb{Z}/{\equiv_m})^*$. By the definition of Euler’s phi function,
there are $\varphi(m)$ elements in $(\mathbb{Z}/{\equiv_m})^*$. Let
$Y_1, Y_2, \ldots, Y_{\varphi(m)}$ be a list of these elements. Then since $f_X$
is one-to-one and onto, each of these elements occurs exactly once in the list
$f_X(Y_1), f_X(Y_2), \ldots, f_X(Y_{\varphi(m)})$. In other words, the two lists
$Y_1, Y_2, \ldots, Y_{\varphi(m)}$ and
$f_X(Y_1), f_X(Y_2), \ldots, f_X(Y_{\varphi(m)})$ contain exactly the same
entries, but listed in different orders, just like the two columns in Figure
7.6. It follows, by the commutative and associative laws for multiplication,
that if we multiply all of the entries in each of the two lists, the products
will be the same (see exercise 21 in Section 6.4):

$$\begin{aligned}
Y_1 \cdot Y_2 \cdots Y_{\varphi(m)} &= f_X(Y_1) \cdot f_X(Y_2) \cdots f_X(Y_{\varphi(m)}) \\
&= (X \cdot Y_1) \cdot (X \cdot Y_2) \cdots (X \cdot Y_{\varphi(m)}) \\
&= X^{\varphi(m)} \cdot (Y_1 \cdot Y_2 \cdots Y_{\varphi(m)}),
\end{aligned}$$

where of course by $X^{\varphi(m)}$ we mean $X$ multiplied by itself
$\varphi(m)$ times. To simplify this equation, let
$Z = Y_1 \cdot Y_2 \cdots Y_{\varphi(m)}$. Then the equation says
$Z = X^{\varphi(m)} \cdot Z$. Since $(\mathbb{Z}/{\equiv_m})^*$ is closed under
multiplication, $Z \in (\mathbb{Z}/{\equiv_m})^*$, so it has an inverse.
Multiplying both sides of the equation $Z = X^{\varphi(m)} \cdot Z$ by $Z^{-1}$,
we get

$$[1]_m = Z \cdot Z^{-1} = X^{\varphi(m)} \cdot Z \cdot Z^{-1} = X^{\varphi(m)} \cdot [1]_m = X^{\varphi(m)}.$$


Thus, we have proven the following theorem.


Theorem 7.4.2. Suppose $m$ is a positive integer and
$X \in (\mathbb{Z}/{\equiv_m})^*$. Then $X^{\varphi(m)} = [1]_m$.


To understand the significance of this theorem, it may help to rephrase it 
in terms of numbers. 


Theorem 7.4.3. (Euler’s theorem) Suppose $m$ is a positive integer. Then for
every positive integer $a$, if $\gcd(m, a) = 1$ then
$a^{\varphi(m)} \equiv 1 \pmod{m}$.


Proof. Suppose $a$ is a positive integer and $\gcd(m, a) = 1$. Then by Theorem
7.3.7, $[a]_m \in (\mathbb{Z}/{\equiv_m})^*$, so by Theorem 7.4.2,
$[a]_m^{\varphi(m)} = [1]_m$, where $[a]_m^{\varphi(m)}$ denotes $[a]_m$
multiplied by itself $\varphi(m)$ times. But

$$[a]_m^{\varphi(m)} = \underbrace{[a]_m \cdot [a]_m \cdots [a]_m}_{\varphi(m)\text{ terms}} = [\underbrace{a \cdot a \cdots a}_{\varphi(m)\text{ terms}}]_m = [a^{\varphi(m)}]_m.$$

(For a more careful proof of this equation, see exercise 5.) Thus,
$[a^{\varphi(m)}]_m = [a]_m^{\varphi(m)} = [1]_m$, and therefore
$a^{\varphi(m)} \equiv 1 \pmod{m}$. □


For example, 10 and 7 are relatively prime, so according to Euler’s
theorem, $7^{\varphi(10)}$ should be congruent to 1 modulo 10. To check this, we
compute

$$7^{\varphi(10)} = 7^4 = 2401 \equiv 1 \pmod{10}.$$
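A check like this one can be automated. The sketch below is illustrative only: `phi` and `check_euler` are names chosen here, and `phi` counts invertible classes by testing $\gcd(m, a) = 1$, which is justified by Theorem 7.3.7. Python's built-in three-argument `pow` reduces modulo $m$ as it goes, so the full power is never written out.

```python
from math import gcd


def phi(m):
    """Euler's phi function, computed by counting (fine for small m)."""
    return sum(1 for a in range(1, m + 1) if gcd(m, a) == 1)


def check_euler(m, a):
    """Return a**phi(m) mod m; Euler's theorem predicts 1 when gcd(m, a) == 1."""
    return pow(a, phi(m), m)


print(phi(10), 7 ** phi(10), check_euler(10, 7))  # 4 2401 1
```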


To apply Euler’s theorem, we need to be able to compute $\varphi(m)$. Of
course, we can check all the elements of $\mathbb{Z}/{\equiv_m}$ one-by-one and
count how many have multiplicative inverses, as we did in the case $m = 10$,
but for large $m$ this will be impractical. We devote the rest of this section
to finding a more efficient way to compute $\varphi(m)$.

We begin by rephrasing the definition of $\varphi(m)$. We know that
$\{0, 1, \ldots, m - 1\}$ is a complete residue system modulo $m$, but since
$0 \equiv m \pmod{m}$, we can also say that $\{1, 2, \ldots, m\}$ is a complete
residue system. Thus,
$\mathbb{Z}/{\equiv_m} = \{[1]_m, [2]_m, \ldots, [m-1]_m, [m]_m\} = \{[a]_m \mid 1 \le a \le m\}$,
where each element of $\mathbb{Z}/{\equiv_m}$ appears exactly once in this list
of elements. To identify which of these elements are in
$(\mathbb{Z}/{\equiv_m})^*$, we use Theorem 7.3.7, which tells us that for any
positive integer $a$, $[a]_m$ has a multiplicative inverse iff $m$ and $a$ are
relatively prime. Thus,

$$(\mathbb{Z}/{\equiv_m})^* = \{[a]_m \mid 1 \le a \le m \text{ and } \gcd(m, a) = 1\}.$$

This gives us another way to understand Euler’s phi function:

$$\varphi(m) = \text{the number of elements in the set } \{a \mid 1 \le a \le m \text{ and } \gcd(m, a) = 1\}.$$


Using this characterization of the phi function, it is easy to compute
$\varphi(p)$ when $p$ is prime: If $1 \le a \le p - 1$ then $p \nmid a$, and
therefore $\gcd(p, a) = 1$, but $\gcd(p, p) = p > 1$. Therefore

$$\{a \mid 1 \le a \le p \text{ and } \gcd(p, a) = 1\} = \{1, 2, \ldots, p - 1\},$$

so $\varphi(p) = p - 1$. In fact, it is almost as easy to compute
$\varphi(p^k)$ for any positive integer $k$. If $a$ is a positive integer and
$p \mid a$ then $\gcd(p^k, a) \ge p > 1$, but if $p \nmid a$ then the only
common divisor of $p^k$ and $a$ is 1, so $\gcd(p^k, a) = 1$. Thus the elements
of the set $\{a \mid 1 \le a \le p^k\}$ that are not relatively prime to $p^k$
are precisely the ones that are divisible by $p$, and those elements are
$p, 2p, 3p, \ldots, p^k = p^{k-1} p$. In other words,

$$\{a \mid 1 \le a \le p^k \text{ and } \gcd(p^k, a) = 1\} = \{1, 2, \ldots, p^k\} \setminus \{p, 2p, \ldots, p^{k-1} p\},$$

and the number of elements in this set is $p^k - p^{k-1} = p^{k-1}(p - 1)$.
Thus $\varphi(p^k) = p^{k-1}(p - 1)$.

To compute $\varphi(m)$ for other values of $m$, we use the following theorem,
which we will prove later in this section.


Theorem 7.4.4. Suppose $m$ and $n$ are relatively prime positive integers.
Then $\varphi(mn) = \varphi(m) \cdot \varphi(n)$.


A function $f$ from the positive integers to the real numbers is called a
multiplicative function if it has the property that for all relatively prime
positive integers $m$ and $n$, $f(mn) = f(m) \cdot f(n)$. Thus, Theorem 7.4.4
says that Euler’s phi function is a multiplicative function. A number of other
important functions in number theory are also multiplicative, but $\varphi$ is
the only one we will study in this book. (For two more examples, see exercises
16 and 17.)

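Theorem 7.4.4 allows us to use the prime factorization of any positive
integer $m$ to find $\varphi(m)$. Suppose the prime factorization of $m$ is
$m = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, where $p_1, p_2, \ldots, p_k$ are
prime numbers and $p_1 < p_2 < \cdots < p_k$. Then $p_1^{e_1}$ and
$p_2^{e_2} \cdots p_k^{e_k}$ are relatively prime, because they have no prime
factors in common (see exercise 5 in Section 7.2), so
$\varphi(m) = \varphi(p_1^{e_1}) \cdot \varphi(p_2^{e_2} \cdots p_k^{e_k})$.
Repeating this reasoning we conclude that

$$\begin{aligned}
\varphi(m) = \varphi(p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}) &= \varphi(p_1^{e_1}) \cdot \varphi(p_2^{e_2}) \cdots \varphi(p_k^{e_k}) \\
&= p_1^{e_1 - 1}(p_1 - 1) \cdot p_2^{e_2 - 1}(p_2 - 1) \cdots p_k^{e_k - 1}(p_k - 1).
\end{aligned}$$

For example, $600 = 2^3 \cdot 3 \cdot 5^2$, so

$$\varphi(600) = \varphi(2^3 \cdot 3 \cdot 5^2) = 2^2(2 - 1) \cdot 3^0(3 - 1) \cdot 5^1(5 - 1) = 160.$$

That was a lot easier than explicitly listing the 160 elements of
$(\mathbb{Z}/{\equiv_{600}})^*$!


Our proof of Theorem 7.4.4 will depend on three lemmas.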
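Before turning to the lemmas, here is how the prime-factorization formula for $\varphi$ might look as a short Python sketch. The function names are invented for this illustration, and the factorization is found by simple trial division, which is an assumption made for brevity; it is adequate for numbers like 600 but hopeless for the very large numbers that appear in cryptography.

```python
def prime_factorization(m):
    """Return {prime: exponent} for a positive integer m, by trial division."""
    factors = {}
    p = 2
    while p * p <= m:
        while m % p == 0:
            factors[p] = factors.get(p, 0) + 1
            m //= p
        p += 1
    if m > 1:                       # whatever is left over is a prime factor
        factors[m] = factors.get(m, 0) + 1
    return factors


def phi_from_factorization(m):
    """phi(m) as the product of p**(e-1) * (p-1) over the prime powers p**e in m."""
    result = 1
    for p, e in prime_factorization(m).items():
        result *= p ** (e - 1) * (p - 1)
    return result


print(prime_factorization(600))     # {2: 3, 3: 1, 5: 2}
print(phi_from_factorization(600))  # 160
```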


Lemma 7.4.5. Suppose $m$ and $n$ are relatively prime positive integers. Then
for all integers $a$ and $b$, $a \equiv b \pmod{mn}$ iff $a \equiv b \pmod{m}$
and $a \equiv b \pmod{n}$.


Proof. See exercise 6. □


Lemma 7.4.6. For all positive integers $a$, $b$, and $c$, $\gcd(ab, c) = 1$ iff
$\gcd(a, c) = 1$ and $\gcd(b, c) = 1$.


Proof. See exercise 7. □


Lemma 7.4.7. Suppose $m$ and $n$ are relatively prime positive integers. Then
for all integers $a$ and $b$, there is some integer $r$ such that
$1 \le r \le mn$, $r \equiv a \pmod{m}$, and $r \equiv b \pmod{n}$.


Proof. Let $a$ and $b$ be arbitrary integers. Since $m$ and $n$ are relatively
prime, there are integers $s$ and $t$ such that $sm + tn = 1$. Therefore
$tn - 1 = -sm$ and $sm - 1 = -tn$. Let $x = tna + smb$. Then

$$x - a = (tn - 1)a + smb = -sma + smb = sm(b - a),$$

so $m \mid (x - a)$, and therefore $x \equiv a \pmod{m}$. Similarly,

$$x - b = tna + (sm - 1)b = tna - tnb = tn(a - b),$$

so $n \mid (x - b)$ and $x \equiv b \pmod{n}$.

Since $\{1, 2, \ldots, mn\}$ is a complete residue system modulo $mn$, we can
find some integer $r$ such that $r \equiv x \pmod{mn}$ and $1 \le r \le mn$. By
Lemma 7.4.5, $r \equiv x \pmod{m}$ and $r \equiv x \pmod{n}$, and by the
transitivity of $\equiv_m$ and $\equiv_n$ it follows that $r \equiv a \pmod{m}$
and $r \equiv b \pmod{n}$. □


Commentary. After the introduction of the arbitrary integers a and b, the 
goal is an existential statement. As is common in proofs of existential 
statements, the proof introduces a number x without providing any 
motivation for the choice of x. The number x turns out to have most of the 
properties we want, but perhaps not all of them, since it might not be 
between 1 and mn. We therefore need an extra step to come up with the 
number r that has all of the required properties. 
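The proof of Lemma 7.4.7 is constructive, so it can be turned into a computation. The Python sketch below follows the proof step by step; the names `bezout` and `simultaneous_residue` are chosen here for illustration, and the sample values $m = 3$, $n = 5$, $a = 2$, $b = 4$ are made up. The integers $s$ and $t$ with $sm + tn = 1$ come from a recursive form of the extended Euclidean algorithm.

```python
def bezout(m, n):
    """Return (g, s, t) with s*m + t*n == g == gcd(m, n)."""
    if n == 0:
        return m, 1, 0
    g, s, t = bezout(n, m % n)
    return g, t, s - (m // n) * t


def simultaneous_residue(m, n, a, b):
    """Find r with 1 <= r <= m*n, r ≡ a (mod m), and r ≡ b (mod n)."""
    g, s, t = bezout(m, n)
    if g != 1:
        raise ValueError("m and n must be relatively prime")
    x = t * n * a + s * m * b      # the number x from the proof
    r = x % (m * n)                # reduce into {0, ..., mn - 1}
    return r if r != 0 else m * n  # use mn in place of 0 to land in {1, ..., mn}


r = simultaneous_residue(3, 5, 2, 4)
print(r, r % 3, r % 5)  # 14 2 4
```

As in the commentary, `x` already has the right remainders; the last two lines of the function are the extra step that moves it into the range from 1 to $mn$.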


We will need one more idea for our proof of Theorem 7.4.4. Suppose $A$ is
a set with $p$ elements and $B$ is a set with $q$ elements; say
$A = \{a_1, a_2, \ldots, a_p\}$ and $B = \{b_1, b_2, \ldots, b_q\}$. Then
$A \times B$ has $pq$ elements. To see why, imagine arranging the elements of
$A \times B$ in a table, with the ordered pair $(a_i, b_j)$ in row $i$, column
$j$ of the table. Since the table will have $p$ rows and $q$ columns,
$A \times B$ must have $pq$ elements. For a more careful proof of this fact,
see exercise 22 in Section 8.1.


We are now ready to prove that $\varphi$ is a multiplicative function.


Proof of Theorem 7.4.4. Let
$R = \{a \mid 1 \le a \le mn \text{ and } \gcd(mn, a) = 1\}$. By Lemma 7.4.6, if
$a \in R$ then $\gcd(m, a) = 1$ and $\gcd(n, a) = 1$, so
$[a]_m \in (\mathbb{Z}/{\equiv_m})^*$ and $[a]_n \in (\mathbb{Z}/{\equiv_n})^*$.
Thus we can define a function
$f : R \to (\mathbb{Z}/{\equiv_m})^* \times (\mathbb{Z}/{\equiv_n})^*$ by the
formula $f(a) = ([a]_m, [a]_n)$. Our plan is to show that $f$ is one-to-one and
onto, which implies that the sets $R$ and
$(\mathbb{Z}/{\equiv_m})^* \times (\mathbb{Z}/{\equiv_n})^*$ have the same number
of elements. But $R$ has $\varphi(mn)$ elements and
$(\mathbb{Z}/{\equiv_m})^* \times (\mathbb{Z}/{\equiv_n})^*$ has
$\varphi(m) \cdot \varphi(n)$ elements, so this will establish that
$\varphi(mn) = \varphi(m) \cdot \varphi(n)$.

To show that $f$ is one-to-one, suppose $a_1 \in R$, $a_2 \in R$, and
$f(a_1) = f(a_2)$. This means that $([a_1]_m, [a_1]_n) = ([a_2]_m, [a_2]_n)$, so
$[a_1]_m = [a_2]_m$ and $[a_1]_n = [a_2]_n$, and therefore
$a_1 \equiv a_2 \pmod{m}$ and $a_1 \equiv a_2 \pmod{n}$. By Lemma 7.4.5 it
follows that $a_1 \equiv a_2 \pmod{mn}$. But since
$\{a \mid 1 \le a \le mn\}$ is a complete residue system modulo $mn$, no two
distinct elements of $R$ are congruent modulo $mn$, so $a_1 = a_2$. This
completes the proof that $f$ is one-to-one.

Finally, to show that $f$ is onto, let $([a]_m, [b]_n)$ be an arbitrary element
of $(\mathbb{Z}/{\equiv_m})^* \times (\mathbb{Z}/{\equiv_n})^*$. By Lemma 7.4.7,
there is some integer $r$ such that $1 \le r \le mn$, $r \equiv a \pmod{m}$,
and $r \equiv b \pmod{n}$. Therefore
$[r]_m = [a]_m \in (\mathbb{Z}/{\equiv_m})^*$ and
$[r]_n = [b]_n \in (\mathbb{Z}/{\equiv_n})^*$, so by Theorem 7.3.7,
$\gcd(m, r) = \gcd(n, r) = 1$. Applying Lemma 7.4.6, we conclude that
$\gcd(mn, r) = 1$. Therefore $r \in R$ and
$f(r) = ([r]_m, [r]_n) = ([a]_m, [b]_n)$, which shows that $f$ is onto. □
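The function $f$ in this proof can be examined experimentally for small $m$ and $n$. In the Python sketch below (the name `check_bijection` is made up for this illustration), each class $[a]_m$ is represented by the remainder `a % m`, and the function reports whether $a \mapsto ([a]_m, [a]_n)$ is one-to-one and onto. It is a sanity check on examples, not a substitute for the proof.

```python
from math import gcd


def check_bijection(m, n):
    """Is a -> (a % m, a % n) a bijection from R onto the pairs of units?"""
    R = [a for a in range(1, m * n + 1) if gcd(m * n, a) == 1]
    units_m = {a % m for a in range(1, m + 1) if gcd(m, a) == 1}
    units_n = {b % n for b in range(1, n + 1) if gcd(n, b) == 1}
    images = [(a % m, a % n) for a in R]
    one_to_one = len(set(images)) == len(images)
    onto = set(images) == {(x, y) for x in units_m for y in units_n}
    return one_to_one and onto


print(check_bijection(4, 9))  # True
print(check_bijection(4, 6))  # False: 4 and 6 are not relatively prime
```

The second call shows why the hypothesis of Theorem 7.4.4 matters: for $m = 4$ and $n = 6$, both 1 and 13 are sent to the same pair, so $f$ is not one-to-one.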


Exercises 


1. List the elements of (Z/=,)*. 
*2. Find $\varphi(m)$:

(a) $m = 539$.

(b) $m = 540$.

(c) $m = 541$.

3. Check these instances of Euler’s theorem by computing $a^{\varphi(m)}$ and
verifying that $a^{\varphi(m)} \equiv 1 \pmod{m}$.

(a) $m = 18$, $a = 5$.

(b) $m = 19$, $a = 2$.

(c) $m = 20$, $a = 3$.

4. Check these instances of Lemma 7.4.7 by finding an integer $r$ such that
$1 \le r \le mn$, $r \equiv a \pmod{m}$, and $r \equiv b \pmod{n}$.

(a) $m = 5$, $n = 8$, $a = 4$, $b = 1$.

(b) $m = 7$, $n = 10$, $a = 6$, $b = 4$.

5. Suppose $m$ and $a$ are positive integers. Use mathematical induction to
prove that for every positive integer $n$, $[a]_m^n = [a^n]_m$.

*6. Prove Lemma 7.4.5.

7. Prove Lemma 7.4.6.

*8. Show that if we drop the hypothesis that $m$ and $n$ are relatively prime
from Lemma 7.4.5, then one direction of the “iff” statement is correct
and one is not. Justify your answer by giving a proof for one direction
and a counterexample for the other.

9. If we drop the hypothesis that $m$ and $n$ are relatively prime from Lemma
7.4.7, is the lemma still correct? Justify your answer by giving either a
proof or a counterexample.

10. Prove Fermat’s little theorem, which says that if $p$ is a prime number,
then for every positive integer $a$, $a^p \equiv a \pmod{p}$.

11. Prove that if $m$ and $a$ are relatively prime positive integers, then
$[a]_m^{-1} = [a^{\varphi(m) - 1}]_m$.

12. Prove that for all positive integers $m$, $a$, $p$, and $q$, if $m$ and $a$
are relatively prime and $p \equiv q \pmod{\varphi(m)}$ then
$a^p \equiv a^q \pmod{m}$.

13. Prove that if $a, b_1, b_2, \ldots, b_k$ are positive integers and
$\gcd(a, b_1) = \gcd(a, b_2) = \cdots = \gcd(a, b_k) = 1$, then
$\gcd(a, b_1 b_2 \cdots b_k) = 1$.

14. Suppose that $m_1, m_2, \ldots, m_k$ are positive integers that are pairwise
relatively prime; i.e., for all $i, j \in \{1, 2, \ldots, k\}$, if $i \ne j$
then $\gcd(m_i, m_j) = 1$. Let $M = m_1 m_2 \cdots m_k$. Prove that for all
integers $a$ and $b$, $a \equiv b \pmod{M}$ iff for every
$i \in \{1, 2, \ldots, k\}$, $a \equiv b \pmod{m_i}$.

15. In this exercise you will prove the Chinese remainder theorem. (The
theorem was first stated by the Chinese mathematician Sun Zi in the
third century.)

(a) Suppose that $m_1, m_2, \ldots, m_k$ are positive integers that are pairwise
relatively prime; i.e., for all $i, j \in \{1, 2, \ldots, k\}$, if $i \ne j$
then $\gcd(m_i, m_j) = 1$. Let $M = m_1 m_2 \cdots m_k$. Prove that for all
integers $a_1, a_2, \ldots, a_k$ there is an integer $r$ such that
$1 \le r \le M$ and for all $i \in \{1, 2, \ldots, k\}$,
$r \equiv a_i \pmod{m_i}$. (Hint: Use induction on $k$. In the induction step,
use Lemma 7.4.7. You will also find exercises 13 and 14 helpful.)

(b) Prove that the integer $r$ in part (a) is unique.

16. For each positive integer $n$, let $\tau(n)$ = the number of elements of
$D(n)$. For example, $D(6) = \{1, 2, 3, 6\}$, so $\tau(6) = 4$. In this exercise
you will prove that $\tau$ is a multiplicative function. Suppose $m$ and $n$
are relatively prime positive integers.

(a) Prove that if $a \in D(m)$ and $b \in D(n)$ then $ab \in D(mn)$.

(b) By part (a), we can define a function $f : D(m) \times D(n) \to D(mn)$ by
the formula $f(a, b) = ab$. Prove that $f$ is one-to-one and onto.

(c) Prove that $\tau(mn) = \tau(m) \cdot \tau(n)$, which shows that $\tau$ is
multiplicative.

17. For each positive integer $n$, let $\sigma(n)$ = the sum of all elements of
$D(n)$. For example, $D(6) = \{1, 2, 3, 6\}$, so
$\sigma(6) = 1 + 2 + 3 + 6 = 12$. Prove that $\sigma$ is a multiplicative
function. (Hint: Use the function $f$ from part (b) of exercise 16.)

18. In this exercise you will prove Euclid’s theorem on perfect numbers.
Recall that a positive integer $n$ is called perfect if $n$ is equal to the sum
of all divisors of $n$ that are smaller than $n$. Equivalently, $n$ is perfect
if $\sigma(n) = 2n$, where $\sigma$ is the function defined in exercise 17.
Prove that if $p$ is a positive integer and $2^p - 1$ is prime, then
$2^{p-1}(2^p - 1)$ is perfect. (Hint: You will find exercise 17 and Example
6.1.1 useful.)

19. In this exercise you will prove Euler’s theorem on perfect numbers.
Suppose $n$ is an even perfect number. (As in exercise 18, to say that $n$ is
perfect means that $\sigma(n) = 2n$, where $\sigma$ is the function defined in
exercise 17.)

(a) Prove that there are positive integers $k$ and $m$ such that $n = 2^k m$
and $m$ is odd.

(b) Prove that $2^{k+1} m = (2^{k+1} - 1)\sigma(m)$.

(c) Prove that $2^{k+1} \mid \sigma(m)$. Thus there is a positive integer $d$
such that $\sigma(m) = 2^{k+1} d$.

(d) Prove that $m = (2^{k+1} - 1)d$.

(e) Prove that $d = 1$. (Hint: Suppose $d > 1$. Then 1, $d$, and $m$ are
distinct divisors of $m$, so $\sigma(m) \ge 1 + d + m$. Derive a
contradiction.)

(f) Let $p = k + 1$. Then by parts (a), (d), and (e), $n = 2^{p-1}(2^p - 1)$.
Prove that $2^p - 1$ is prime. Thus $n$ is a perfect number of the form
considered in exercise 18.


7.5. Public-Key Cryptography 


Suppose you want to make a purchase online. You go to the merchant’s 
website and place your order. Then the website asks you to enter your credit 
card number. You type the number on your computer, and your computer 
must transmit the number over the internet to the merchant’s computer. 

Internet communications generally pass through several computers on 
their way from the sender to the recipient. As a result, there is a possibility 
that someone with access to one of those intermediary computers could be 
eavesdropping when your computer sends your credit card number to the 
merchant. To keep such an eavesdropper from stealing your credit card 
number, your computer scrambles, or encrypts the number before sending 
it. The merchant’s computer then unscrambles, or decrypts the number and 
charges your credit card. 

For example, suppose your credit card number is the 16-digit sequence
$m = m_1 m_2 \cdots m_{16}$. Each $m_i$ is one of the digits 0, 1, 2, ..., 9,
but we will think of it as representing the equivalence class
$[m_i]_{10} \in \mathbb{Z}/{\equiv_{10}}$. If your computer and the merchant’s
computer could agree on a random sequence of digits
$k = k_1 k_2 \cdots k_{16}$, then they could proceed as follows, doing all
calculations in $\mathbb{Z}/{\equiv_{10}}$. Your computer could replace the
$i$th digit $m_i$ of your credit card number with the digit $c_i$ such that
$[c_i]_{10} = [m_i]_{10} + [k_i]_{10}$. Your computer would send the 16-digit
sequence $c = c_1 c_2 \cdots c_{16}$ to the merchant’s computer, which would
then recover the original sequence $m$ by using the formula
$[m_i]_{10} = [c_i]_{10} + (-[k_i]_{10})$. The sequence $k$ is the key that
your computer uses to encrypt the credit card number and the merchant’s
computer uses to decrypt it. An eavesdropper who didn’t know the key $k$
would be unable to decrypt the encrypted message $c$ and learn your credit
card number $m$.
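The digit-by-digit scheme just described might be sketched as follows. The function names and the sample 16-digit message are made up for this illustration; the key digits are drawn with Python's standard `secrets` module, which is one reasonable way to obtain random digits, not something the scheme itself prescribes.

```python
import secrets


def random_key(length):
    """A key: `length` digits chosen uniformly at random."""
    return [secrets.randbelow(10) for _ in range(length)]


def encrypt_digits(message, key):
    """c_i = (m_i + k_i) mod 10 for each digit position i."""
    return [(m + k) % 10 for m, k in zip(message, key)]


def decrypt_digits(ciphertext, key):
    """m_i = (c_i - k_i) mod 10, i.e. add the additive inverse of k_i."""
    return [(c - k) % 10 for c, k in zip(ciphertext, key)]


message = [4, 0, 1, 2, 8, 8, 8, 8, 8, 8, 8, 8, 1, 8, 8, 1]  # made-up digits
key = random_key(len(message))
ciphertext = encrypt_digits(message, key)
print(decrypt_digits(ciphertext, key) == message)  # True
```

Both functions need the same `key`, which is exactly the difficulty taken up next.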


But how can your computer and the merchant’s computer agree on the 
key k? If one computer chooses the key and sends it to the other, then an 
eavesdropper could learn the key and then decrypt the encrypted message. 
Sending the key securely is just as hard as sending the credit card number, 
so we don’t seem to have made any progress. 

The problem with this scheme is that it uses symmetric cryptography, in 
which the same key is used for both encryption and decryption. The 
solution to the problem is to use public-key cryptography, in which the 
encryption and decryption keys are different. The merchant’s computer 


creates two keys, one for encryption and one for decryption. It sends the 
encryption key to your computer. Your computer uses the encryption key to 
encrypt your credit card number and then sends the encrypted number to the 
merchant’s computer, which uses the decryption key to recover the credit 
card number. An eavesdropper could learn the encryption key, so this key is 
regarded as a public key. But this doesn’t help the eavesdropper, because 
decryption requires the decryption key, and this key is never transmitted 
and remains secret. 

It may seem surprising that it is possible to have different keys for 
encryption and decryption, but it can be done. In this section we discuss one 
well-known public-key encryption system called RSA. It is named for Ron
Rivest (1947–), Adi Shamir (1952–), and Leonard Adleman (1945–), who
developed the system in 1977. A similar system was developed in 1973 by
Clifford Cocks (1950–), a mathematician working for the British
intelligence agency, but it was classified and not revealed until 1997. As we
will see, the RSA system is based on Euler’s theorem.

We have introduced the idea of public-key cryptography in the context of
internet purchases, but it can be used any time one person wants to send a
message to another while preventing an eavesdropper from reading the
message. Suppose Alice wants to send a message securely to Bob. To use
the RSA public-key system, they would proceed as follows. First Bob
chooses two distinct prime numbers $p$ and $q$. He computes $n = pq$ and
$\varphi(n) = (p - 1)(q - 1)$. Next, he chooses a positive integer $e$ such
that $e$ and $\varphi(n)$ are relatively prime and $e < \varphi(n)$. By Theorem
7.3.7, $[e]_{\varphi(n)}$ has a multiplicative inverse in
$\mathbb{Z}/{\equiv_{\varphi(n)}}$, which can be computed by the extended
Euclidean algorithm. Thus, Bob can compute a positive integer $d$ such that
$d < \varphi(n)$ and
$[e]_{\varphi(n)} \cdot [d]_{\varphi(n)} = [1]_{\varphi(n)}$, which means that
$ed \equiv 1 \pmod{\varphi(n)}$. Bob sends the pair of numbers $(n, e)$ to
Alice; this is the encryption key that Alice will use to encrypt her message.
He keeps the numbers $p$, $q$, and $d$ secret; he will use $d$ to decrypt
Alice’s message.
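Bob's side of this setup can be sketched in a few lines of Python. The function name `make_keys` is invented here, the code does not check that `p` and `q` are actually prime, and it relies on the built-in `pow(e, -1, phi_n)` (available in Python 3.8 and later) to compute the inverse that the text obtains from the extended Euclidean algorithm. The toy values 3, 11, and 7 are far too small for real use.

```python
from math import gcd


def make_keys(p, q, e):
    """Return ((n, e), d) for distinct primes p, q and a valid exponent e."""
    n = p * q
    phi_n = (p - 1) * (q - 1)
    if not (0 < e < phi_n and gcd(e, phi_n) == 1):
        raise ValueError("e must be less than phi(n) and relatively prime to it")
    d = pow(e, -1, phi_n)  # inverse of [e] modulo phi(n); needs Python 3.8+
    return (n, e), d


public_key, d = make_keys(3, 11, 7)
print(public_key, d)  # (33, 7) 3
```

The pair `public_key` is what Bob would send to Alice; `d` stays with Bob.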

We will assume that the message Alice wants to send is a natural number 
m < n. Of course, her message might actually be a piece of text, not a 
number, but a piece of text can be encoded as a natural number. If the text is 
long, it might be necessary to encode it as a sequence of natural numbers, 
each of which is less than n, and then each of these natural numbers would 


have to be encrypted separately. But to keep the discussion simple, we will 
assume that Alice’s message is a single natural number m < n. 

As before, we think of the message $m$ as representing an equivalence
class $[m]_n \in \mathbb{Z}/{\equiv_n}$, and Alice and Bob will do all of their
calculations using arithmetic in $\mathbb{Z}/{\equiv_n}$. To encrypt her
message, Alice computes $[m]_n^e$; in other words, she computes the unique
natural number $c < n$ such that $[m]_n^e = [c]_n$. The number $c$ is the
encrypted message, which she sends to Bob.

To decrypt the message, Bob computes $[c]_n^d$. What makes the RSA
system work is the surprising fact that $[c]_n^d = [m]_n$, as we will prove
below. Thus, by computing $[c]_n^d$, Bob can recover the original message $m$.
Notice that encryption and decryption both involve exponentiation, but the
encryption exponent $e$ and the decryption exponent $d$ are different. Thus, it
doesn’t matter if an eavesdropper learns $e$; as long as Bob keeps $d$ secret,
the eavesdropper will not know what exponent to use to decrypt the
encrypted message.

To show that RSA works we need to prove the following theorem. 


Theorem 7.5.1. Suppose $p$ and $q$ are distinct primes, $n = pq$, $e$ and $d$
are positive integers such that $ed \equiv 1 \pmod{\varphi(n)}$, and $m$ and $c$
are natural numbers such that $[m]_n^e = [c]_n$. Then $[c]_n^d = [m]_n$.


Proof. If $e = d = 1$ then $[m]_n = [c]_n$ and the conclusion clearly holds. If
not, then $ed > 1$, so since $ed \equiv 1 \pmod{\varphi(n)}$, there is some
positive integer $k$ such that $ed - 1 = k\varphi(n)$, and therefore
$ed = k\varphi(n) + 1 = k(p - 1)(q - 1) + 1$. And since $[m]_n^e = [c]_n$, we
have $m^e \equiv c \pmod{n}$, so $n \mid (m^e - c)$.

Although we ultimately want to draw a conclusion about arithmetic in
$\mathbb{Z}/{\equiv_n}$, we will find it useful to do some calculations in
$\mathbb{Z}/{\equiv_p}$ and $\mathbb{Z}/{\equiv_q}$ first. Since $p \mid n$ and
$n \mid (m^e - c)$, by the transitivity of the divisibility relation,
$p \mid (m^e - c)$. Therefore $m^e \equiv c \pmod{p}$, or equivalently
$[m]_p^e = [c]_p$.

Note that the usual exponent rules work for exponentiation in
$\mathbb{Z}/{\equiv_p}$. Specifically, for any $X \in \mathbb{Z}/{\equiv_p}$ and
any positive integers $a$ and $b$ we have

$$X^a \cdot X^b = \underbrace{X \cdots X}_{a\text{ terms}} \cdot \underbrace{X \cdots X}_{b\text{ terms}} = \underbrace{X \cdots X}_{a+b\text{ terms}} = X^{a+b}$$

and

$$(X^a)^b = \underbrace{\underbrace{X \cdots X}_{a\text{ terms}} \cdot \underbrace{X \cdots X}_{a\text{ terms}} \cdots \underbrace{X \cdots X}_{a\text{ terms}}}_{b\text{ groups of terms}} = \underbrace{X \cdots X}_{ab\text{ terms}} = X^{ab}.$$

(For more careful proofs of these equations, see exercise 8.) Applying these
rules, we see that

$$[c]_p^d = ([m]_p^e)^d = [m]_p^{ed} = [m]_p^{k(p-1)(q-1)+1} = ([m]_p^{p-1})^{k(q-1)} \cdot [m]_p.$$

We claim now that $[c]_p^d = [m]_p$. To prove this, we consider two cases.

Case 1. $p \nmid m$. Then $p$ and $m$ are relatively prime, so by Euler’s
theorem, $[m]_p^{p-1} = [1]_p$. Therefore

$$[c]_p^d = ([m]_p^{p-1})^{k(q-1)} \cdot [m]_p = [1]_p^{k(q-1)} \cdot [m]_p = [1]_p \cdot [m]_p = [m]_p.$$

Case 2. $p \mid m$. Then $[m]_p = [0]_p$, so

$$[c]_p^d = [m]_p^{ed} = [0]_p^{ed} = [0]_p = [m]_p.$$

In both cases we have reached the desired conclusion that $[c]_p^d = [m]_p$.
Therefore $c^d \equiv m \pmod{p}$. Similar reasoning shows that
$c^d \equiv m \pmod{q}$, and since $pq = n$, it follows by Lemma 7.4.5 that
$c^d \equiv m \pmod{n}$. In other words, $[c]_n^d = [m]_n$, which is what we
wanted to prove. □


Let’s try this out in a simple example. Suppose Bob chooses the primes
$p = 3$ and $q = 11$, so $n = pq = 33$ and $\varphi(n) = (p - 1)(q - 1) = 20$.
He also chooses $e = 7$, and he then computes
$[e]_{\varphi(n)}^{-1} = [7]_{20}^{-1} = [3]_{20}$, so $d = 3$. (As a check on
Bob’s work, note that $[7]_{20} \cdot [3]_{20} = [21]_{20} = [1]_{20}$.) Bob
sends the numbers $n = 33$ and $e = 7$ to Alice.

Suppose Alice wants to send the message $m = 5$ to Bob. She computes

$$[m]_n^e = [5]_{33}^7 = [78125]_{33} = [14]_{33},$$

so her encrypted message is $c = 14$. She sends this number to Bob. To
decrypt the message, Bob computes

$$[c]_n^d = [14]_{33}^3 = [2744]_{33} = [5]_{33}.$$

Thus, Bob successfully recovers the original message $m = 5$.
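The same round trip can be carried out in Python. In this sketch the function names `rsa_encrypt` and `rsa_decrypt` are made up for illustration; each is a single call to the built-in three-argument `pow`, which computes a power and reduces it modulo $n$.

```python
def rsa_encrypt(m, n, e):
    """Alice's step: the unique c < n with [m]^e = [c] in Z/≡_n."""
    return pow(m, e, n)


def rsa_decrypt(c, n, d):
    """Bob's step: the unique m < n with [c]^d = [m] in Z/≡_n."""
    return pow(c, d, n)


n, e, d = 33, 7, 3
c = rsa_encrypt(5, n, e)
print(c, rsa_decrypt(c, n, d))  # 14 5
# Theorem 7.5.1 says every message m < n survives the round trip.
print(all(rsa_decrypt(rsa_encrypt(m, n, e), n, d) == m for m in range(n)))  # True
```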

Are Alice and Bob’s communications secure? Suppose an eavesdropper
intercepts both Bob’s message to Alice and Alice’s message to Bob, thus
learning the numbers $n = 33$, $e = 7$, and $c = 14$. By factoring
$n = 33 = 3 \cdot 11$, the eavesdropper could learn that $p = 3$ and $q = 11$
(or vice-versa), and therefore $\varphi(n) = (p - 1)(q - 1) = 20$. But then the
eavesdropper could compute, just as Bob did, that
$[e]_{\varphi(n)}^{-1} = [7]_{20}^{-1} = [3]_{20}$, thus learning the
decryption exponent $d = 3$. The eavesdropper can now decrypt Alice’s
message just the way Bob did. The communications are not secure!
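The eavesdropper's attack is short enough to write out. The following Python sketch uses invented function names and factors $n$ by trial division, which works instantly for $n = 33$; it assumes that $n$ is a product of two distinct primes and uses `pow(e, -1, phi_n)` (Python 3.8 or later) for the inverse.

```python
def smallest_prime_factor(n):
    """Trial division; only practical when n is small."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n


def recover_private_exponent(n, e):
    """What the eavesdropper does: factor n, compute phi(n), invert e."""
    p = smallest_prime_factor(n)
    q = n // p
    phi_n = (p - 1) * (q - 1)
    return pow(e, -1, phi_n)  # needs Python 3.8+


d = recover_private_exponent(33, 7)
print(d, pow(14, d, 33))  # 3 5
```

Everything here is fast except the call to `smallest_prime_factor`, whose running time grows roughly like the size of the smaller prime factor of $n$.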

What has gone wrong? The problem is that in this simple example we 
have used small numbers. The eavesdropper’s first step was to factor n = 
33, which is a product of two prime numbers. A small number n can be 
factored easily by simply dividing n by all smaller prime numbers until a 
prime factor is found, but if n is large then this procedure will take too long 
to be practical. Factoring numbers that are products of two large prime 
numbers is especially hard. As of 2019, the largest such number that has 
ever been factored is a product of two 116-digit primes. It was factored in 
2009 after two years of computation by many hundreds of computers 
working together on the problem, using the equivalent of almost 2000 years 
of computing by a single computer. Factoring a product of primes 
significantly larger than this would not be feasible with current computing 
technology. Today most people who use RSA choose prime numbers that 
are several hundreds of digits long. If an eavesdropper learns the numbers n 
and e, then in principle he has enough information to find the decryption 
exponent d, but the only known way to do it is to factor n. The security of 
RSA depends on the fact that, in practice, the numbers used are so large that 
factoring n is not feasible. 

But wait! What about the computations that Alice and Bob have to do 
with these extremely large numbers? Will they also be computationally 
infeasible? If so, then the system will be useless. Fortunately, there are 
efficient ways to do the computations required of Alice and Bob. While a 
detailed discussion of how these computations are performed is beyond the 
scope of this book, we can briefly comment on the main points. 


The most difficult computations Alice and Bob have to do are:


• Bob must find two large prime numbers $p$ and $q$.
• Bob must find $[e]_{\varphi(n)}^{-1}$.
• Alice must compute $[m]_n^e$, and Bob must compute $[c]_n^d$.


To find the primes p and q, Bob can simply choose suitably large 
numbers at random and test them to see if they are prime until he finds two 
primes. The problem of testing a large number to see if it is prime has been 
studied extensively. In 2019, using the best known methods, a computer can 
determine whether or not a 1000-digit number is prime in a few minutes. 
But this is not fast enough to be convenient for RSA, since Bob may have 
to test hundreds of numbers for primality before he finds a prime. So most 
implementations of RSA use probabilistic primality tests. These tests take a 
fraction of a second, but they are not guaranteed to be accurate; in 
particular, if a number is not prime, there is a chance that the test will fail to 
detect this and report that the number is prime. But by repeating the test 
several times, the probability of an error can be made as small as desired. 
For more on probabilistic primality testing, see exercises 10–14. 
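As a rough illustration of the search just described, here is a minimal Python sketch that draws random candidates and screens them with the Fermat test from exercise 10. The function names and the default of 20 rounds are illustrative assumptions, not part of the text, and real RSA implementations typically use stronger tests such as the Miller-Rabin test of exercise 14.

```python
import random

def fermat_test(n, rounds=20):
    """Return False if n is certainly composite, True if n passed every round."""
    if n < 4:
        return n in (2, 3)
    for _ in range(rounds):
        a = random.randrange(2, n)      # random base a in {2, 3, ..., n - 1}
        if pow(a, n - 1, n) != 1:       # a is a Fermat witness, so n is not prime
            return False
    return True                         # probably prime; Carmichael numbers can fool this test

def random_probable_prime(digits):
    """Draw random odd numbers with the given number of digits until one passes."""
    while True:
        candidate = random.randrange(10 ** (digits - 1), 10 ** digits) | 1
        if fermat_test(candidate):
            return candidate

print(random_probable_prime(30))
```

A prime always passes, so the only possible error is reporting a composite number as prime; raising `rounds` makes that less likely, in line with the remark above about repeating the test.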

We already know a method that Bob can use to compute $[e]_{\varphi(n)}^{-1}$: the 
extended Euclidean algorithm. This algorithm is very fast, even with very 
large numbers. For more on this, see exercise 13 in Section 7.1. 
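The following Python sketch shows one way the extended Euclidean algorithm can be used to find the inverse. The function names `extended_gcd` and `mod_inverse` are hypothetical, and the numbers in the last line are the ones that appear in Example 7.5.3.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and s*a + t*b == g."""
    r0, r1 = a, b
    s0, s1 = 1, 0
    t0, t1 = 0, 1
    while r1 != 0:
        q = r0 // r1                    # quotient for this division step
        r0, r1 = r1, r0 - q * r1        # remainders
        s0, s1 = s1, s0 - q * s1        # coefficients of a
        t0, t1 = t1, t0 - q * t1        # coefficients of b
    return r0, s0, t0

def mod_inverse(e, phi):
    """Return d with 0 <= d < phi and e*d congruent to 1 modulo phi."""
    g, _, t = extended_gcd(phi, e)
    if g != 1:
        raise ValueError("e and phi are not relatively prime, so e has no inverse")
    return t % phi

print(mod_inverse(184270657, 1838041320))   # 88235833
```

The loop performs one division per row of a table like Figure 7.7, which is why the method stays fast even for very large numbers.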

Finally, to encrypt and decrypt messages, Alice and Bob must raise 
elements of $\mathbb{Z}/{\equiv_n}$ to high powers. Suppose $X \in \mathbb{Z}/{\equiv_n}$ and a is a positive 
integer. The most straightforward way to compute $X^a$ is to multiply X by 
itself a times, but this will not be feasible if a is large. There is a better way 
using recursion. If a = 1, then of course $X^a = X$. For larger values of a, we 
use the following formulas: 


$$X^{2k} = X^k \cdot X^k,$$
$$X^{2k+1} = X^k \cdot X^k \cdot X.$$

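Example 7.5.2. Find $[347]_{582}^{172}$.

Solution 


Let $X = [347]_{582} \in \mathbb{Z}/{\equiv_{582}}$; we must find $X^{172}$. Since 172 is even, we start 
with 


$$X^{172} = X^{2 \cdot 86} = X^{86} \cdot X^{86}.$$


If we can find $X^{86}$, then we'll just have to multiply it by itself to find $X^{172}$. 
To find $X^{86}$, we use the same method: 


$$X^{86} = X^{2 \cdot 43} = X^{43} \cdot X^{43}.$$

Now we need to find $X^{43}$, and since 43 is odd, we use the formula 

$$X^{43} = X^{2 \cdot 21 + 1} = X^{21} \cdot X^{21} \cdot X.$$

Continuing in this way, we get the following list of formulas: 


$X^{172} = X^{86} \cdot X^{86}$,
$X^{86} = X^{43} \cdot X^{43}$,
$X^{43} = X^{21} \cdot X^{21} \cdot X$,
$X^{21} = X^{10} \cdot X^{10} \cdot X$,
$X^{10} = X^{5} \cdot X^{5}$,
$X^{5} = X^{2} \cdot X^{2} \cdot X$,
$X^{2} = X \cdot X$.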

We can now work through this list in reverse order and evaluate each 
formula: 


$X^{2} = X \cdot X = [347]_{582} \cdot [347]_{582} = [120409]_{582} = [517]_{582}$,
$X^{5} = X^{2} \cdot X^{2} \cdot X = [517]_{582} \cdot [517]_{582} \cdot [347]_{582} = [92749283]_{582} = [17]_{582}$,
$X^{10} = X^{5} \cdot X^{5} = [17]_{582} \cdot [17]_{582} = [289]_{582}$,
$X^{21} = X^{10} \cdot X^{10} \cdot X = [289]_{582} \cdot [289]_{582} \cdot [347]_{582} = [28981787]_{582} = [515]_{582}$,
$X^{43} = X^{21} \cdot X^{21} \cdot X = [515]_{582} \cdot [515]_{582} \cdot [347]_{582} = [92033075]_{582} = [251]_{582}$,
$X^{86} = X^{43} \cdot X^{43} = [251]_{582} \cdot [251]_{582} = [63001]_{582} = [145]_{582}$,
$X^{172} = X^{86} \cdot X^{86} = [145]_{582} \cdot [145]_{582} = [21025]_{582} = [73]_{582}$.


We conclude that $[347]_{582}^{172} = [73]_{582}$. If you count, you will find that we 
performed only 10 multiplications, far fewer than the 171 that would be 
required if we simply multiplied together 172 copies of X. For more on the number of 
multiplications required to compute $X^a$ in general, see exercise 9. 
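The recursive method of Example 7.5.2 can be sketched in Python as follows. The function name `power_mod` and the choice to return a multiplication count alongside the result are assumptions made for illustration.

```python
def power_mod(x, a, n):
    """Return (x**a mod n, number of multiplications used), for an integer a >= 1."""
    if a == 1:
        return x % n, 0
    half, count = power_mod(x, a // 2, n)   # a = 2k or a = 2k + 1, where k = a // 2
    value = (half * half) % n               # X^k * X^k
    count += 1
    if a % 2 == 1:
        value = (value * x) % n             # one more factor of X when a is odd
        count += 1
    return value, count

print(power_mod(347, 172, 582))   # (73, 10)
```

Each recursive call halves the exponent, so the number of calls grows with the number of binary digits of a rather than with a itself. Python's built-in `pow(x, a, n)` computes the same value.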


We end this section with one more example of the use of RSA. This time 
we will use numbers that are large enough to force us to use efficient 
methods of calculation, although they are still not as large as would be used 
in a real application of RSA. 


Example 7.5.3. Suppose Bob chooses the prime numbers p = 48611 and q = 
37813. He computes n = pq = 1838127743 and $\varphi(n)$ = (p - 1)(q - 1) = 
1838041320. He then chooses the encryption exponent e = 184270657. 


1. Find the decryption exponent d. 
2. Suppose Alice wants to send the message m = 357249732. Find the 
encrypted message c, and verify that Bob can decrypt it. 


Solutions 


1. To compute d, Bob uses the extended Euclidean algorithm to find 
$[e]_{\varphi(n)}^{-1} = [184270657]_{1838041320}^{-1}$. The steps are shown in Figure 7.7. 
Bob concludes that d = 88235833. 


     n    q_n          r_n         s_n          t_n
     0          1838041320           1            0
     1           184270657           0            1
     2      9    179605407           1           -9
     3      1      4665250          -1           10
     4     38      2325907          39         -389
     5      2        13436         -79          788
     6    173         1479       13706      -136713
     7      9          125     -123433      1231205
     8     11          104     1371469    -13679968
     9      1           21    -1494902     14911173
    10      4           20     7351077    -73324660
    11      1            1    -8845979     88235833
    12     20            0


Figure 7.7. Computing the decryption exponent d. 


As a check, Bob can compute that 

$$ed - 1 = 16259274917852280 = 8845979\,\varphi(n),$$

so $ed \equiv 1 \pmod{\varphi(n)}$. 

2. Let $X = [m]_n = [357249732]_{1838127743}$. To encrypt her message, Alice 
must compute $X^e = X^{184270657}$. The steps are shown in Figure 7.8; of 
course, Alice plans her calculations by starting at the end of this table, 
but performs the calculations from the beginning. She sends the 
encrypted message c = 1357673396. 


        k   X^k                         k   X^k
        2   [413387288]_n           44987   [418397817]_n
        5   [1105456936]_n          89975   [1597035021]_n
       10   [1522283045]_n         179951   [1491451285]_n
       21   [1773257888]_n         359903   [954701208]_n
       43   [638596171]_n          719807   [1817497177]_n
       87   [664005337]_n         1439614   [1774588706]_n
      175   [661296271]_n         2879229   [1061291500]_n
      351   [993223048]_n         5758458   [21397340]_n
      702   [1294276724]_n       11516916   [1624593674]_n
     1405   [1088781967]_n       23033832   [1474914774]_n
     2811   [1010306117]_n       46067664   [1189097151]_n
     5623   [1064784897]_n       92135328   [46825442]_n
    11246   [1739950485]_n      184270657   [1357673396]_n
    22493   [799178524]_n


Figure 7.8. Computing the encrypted message c. 


To decrypt the message, Bob lets $Y = [c]_n$ and computes $Y^d = Y^{88235833}$, 
as shown in Figure 7.9. As expected, he gets m = 357249732. 


        k   Y^k                         k   Y^k
        2   [42593275]_n            21541   [120530669]_n
        5   [1698473378]_n          43083   [189879402]_n
       10   [1210371791]_n          86167   [781925623]_n
       21   [1085519751]_n         172335   [1276315424]_n
       42   [1335983514]_n         344671   [1511938429]_n
       84   [1212154100]_n         689342   [1116941725]_n
      168   [638363154]_n         1378684   [748516067]_n
      336   [1695419879]_n        2757369   [590443992]_n
      673   [250463254]_n         5514739   [1169450853]_n
     1346   [1092090842]_n       11029479   [83459512]_n
     2692   [149835148]_n        22058958   [643822280]_n
     5385   [1009240318]_n       44117916   [1032113647]_n
    10770   [1219871219]_n       88235833   [357249732]_n

Figure 7.9. Decrypting the message. 
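The whole of Example 7.5.3 can be replayed in a few lines of Python, as a sanity check rather than a model of a real RSA library. The sketch below assumes Python 3.8 or later, where the built-in `pow` accepts a negative exponent together with a modulus to compute a modular inverse. The variable names simply mirror the symbols in the example.

```python
p, q = 48611, 37813
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)      # phi(n), valid because p and q are prime
e = 184270657                # Bob's encryption exponent

d = pow(e, -1, phi)          # decryption exponent: inverse of e modulo phi(n)

m = 357249732                # Alice's message
c = pow(m, e, n)             # encryption; the text reports c = 1357673396
recovered = pow(c, d, n)     # decryption

print(n, phi, d)
print(c, recovered)
```

Here `pow(m, e, n)` does the same job as the repeated squaring tabulated in Figures 7.8 and 7.9, reducing modulo n after each multiplication.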
Exercises 


1. Suppose Bob chooses p = 5, q = 11, and e = 7. 

(a) Find n, $\varphi(n)$, and d. 

(b) Suppose Alice wants to send the message m = 9. Find the encrypted 
message c, and verify that Bob can decrypt it. 

*2. Suppose Bob chooses p = 71, q = 83, and e = 1369. 

(a) Find n, $\varphi(n)$, and d. 

(b) Suppose Alice wants to send the message m = 1001. Find the encrypted 
message c, and verify that Bob can decrypt it. 

3. Suppose Bob chooses p = 71 and q = 83. Why would e = 1368 be a bad 
choice? 

4. Suppose Bob chooses p = 17389, q = 14947, and e = 35824631. 

(a) Find n, $\varphi(n)$, and d. 

(b) Suppose Alice wants to send the message m = 123456789. Find the 
encrypted message c, and verify that Bob can decrypt it. 


5. You are eavesdropping on Alice and Bob. You intercept the message (n, 
e) = (493, 129) sent to Alice by Bob, and then the message c = 149 sent 
to Bob by Alice. 

(a) Factor n. 

(b) Find the decryption exponent d. 

(c) Decrypt the message. 

6. Suppose Alice and Bob are using RSA. As usual, Bob has generated the 
numbers n, e, and d, and he has sent n and e to Alice but kept d secret. 
Alice has a message m that represents a contract that she wants Bob to 
sign. The contract is not secret; she is glad to send it to Bob without 
encrypting it. But she wants Bob to send back a digital signature for the 
contract. Like an ordinary signature, it should be a message that 
someone else could not forge, so that Alice knows that Bob, and not 
some impostor, has written the signature, and Bob cannot deny at a later 
date that he signed the contract. To create his signature, Bob computes 
the unique integer s such that $0 \le s < n$ and $[m]_n^d = [s]_n$, and he sends s 
to Alice. 

(a) Show that $[s]_n^e = [m]_n$, and if $s'$ is any integer such that $0 \le s' < n$ and 
$s' \ne s$, then $[s']_n^e \ne [m]_n$. Thus, Alice can authenticate the signature by 
computing $[s]_n^e$ and verifying that it is equal to $[m]_n$. 

(b) Why can't an impostor forge Bob's signature? 

*7. In this exercise we will see why it is important for p and q to be prime. 
Suppose Bob chooses p = 9, q = 35, and e = 95, not noticing that 9 and 
35 are not prime. He computes n = pq = 315, and he sends (n, e) = (315, 
95) to Alice. 

(a) Suppose Alice wants to send the message m = 123. What encrypted 
message c will she send? 

(b) Bob computes (p - 1)(q - 1) = 272; he thinks this is $\varphi(n)$, but he's 
wrong. To find the decryption exponent d, he then computes 
$[e]_{272}^{-1} = [d]_{272}$. What value of d does he get? 

(c) Using the decryption exponent d from part (b), what does Bob get when 
he tries to decrypt Alice's message? 

(d) What is the correct value of $\varphi(n)$? What decryption exponent d would 
Bob have gotten if he had used the correct value for $\varphi(n)$ and computed 
$[e]_{\varphi(n)}^{-1} = [d]_{\varphi(n)}$? Using this decryption exponent, what would Bob 
have gotten when he tried to decrypt Alice's message? 

8. Suppose m is a positive integer and $X \in \mathbb{Z}/{\equiv_m}$. 

(a) Give a recursive definition of $X^a$, for positive integers a. 

(b) Use mathematical induction to prove that for all positive integers a and 
b, $X^a \cdot X^b = X^{a+b}$. 

(c) Use mathematical induction to prove that for all positive integers a and 
b, $(X^a)^b = X^{ab}$. 

*9. Suppose $X \in \mathbb{Z}/{\equiv_n}$. Prove that for every positive integer a, the recursive 
method of computing $X^a$ that is illustrated in Example 7.5.2 uses at 
most $2\log_2 a$ multiplications. 


Exercises 10–14 are concerned with probabilistic primality testing. In these 
problems, we are looking for a computational test that can be performed on 
a positive integer n so that if n is prime then n passes the test, and if n is not 
prime then it fails the test. We will find that there are some tests that work 
in many cases, but not all cases. 

10. According to Euler's theorem, if n is prime and $2 \le a \le n - 1$, then 
$a^{n-1} \equiv 1 \pmod{n}$. This suggests the following primality test: To test whether 
or not an integer n > 2 is prime, choose a random number $a \in \{2, 3, \ldots, n - 1\}$ 
and check whether or not $a^{n-1} \equiv 1 \pmod{n}$. If so, then n passes 
the test, and if not, then it fails. This test is called the Fermat primality 
test, because the instance of Euler's theorem on which it is based is 
closely related to Fermat's little theorem; see exercise 10 in Section 7.4. 
If n is prime, then by Euler's theorem, it is guaranteed to pass the test. 
Unfortunately, composite numbers sometimes pass the test as well. If 
$2 \le a \le n - 1$ and $a^{n-1} \equiv 1 \pmod{n}$, but n is not prime, then we say that n 
is a Fermat pseudoprime to the base a; it passes the Fermat primality 
test using the base a, even though it is not prime. If $2 \le a \le n - 1$ and 
$a^{n-1} \not\equiv 1 \pmod{n}$ then we say that a is a Fermat witness for n. If there 
is a Fermat witness for n then, by Euler's theorem, n is not prime. 

(a) Show that 15 is a Fermat pseudoprime to the base 4, but 3 is a Fermat 
witness for 15. 

(b) Show that if n is a Fermat pseudoprime to the base a, then n and a are 
relatively prime. 

11. Recall from exercise 5 in Section 6.2 that the numbers $F_n = 2^{(2^n)} + 1$ are 
called Fermat numbers. Fermat showed that $F_n$ is prime for $0 \le n \le 4$, 
and Euler showed that $F_5$ is not prime. It is not known if there is any n 
> 4 for which $F_n$ is prime. In this exercise you will show that for every 
natural number n, $2^{F_n - 1} \equiv 1 \pmod{F_n}$. Thus, if $F_n$ is not prime, then in 
the terminology of exercise 10, it is a Fermat pseudoprime to the base 
2. In other words, the Fermat primality test with a = 2 will not be useful 
for testing whether $F_n$ is prime. 

(a) Show that $2^{(2^n)} \equiv -1 \pmod{F_n}$. 

(b) Show that $2^{(2^{n+1})} \equiv 1 \pmod{F_n}$. 

(c) Show that $2^{n+1} \mid (F_n - 1)$. (Hint: Use exercise 12(a) in Section 6.3.) 

(d) Show that $2^{F_n - 1} \equiv 1 \pmod{F_n}$. (Hint: Use parts (b) and (c) and 
exercise 16 in Section 7.3.) 

12. Suppose n is an integer larger than 2 and let $R = \{2, 3, \ldots, n - 1\}$. Let 

$$R_1 = \{a \in R \mid a^{n-1} \equiv 1 \pmod{n}\},$$
$$R_2 = R \setminus R_1 = \{a \in R \mid a^{n-1} \not\equiv 1 \pmod{n}\}.$$

Suppose $a \in R_2$ and gcd(n, a) = 1. Then a is a Fermat witness for n, so 
n is not prime. (See exercise 10 for the meanings of the terms used in 
this exercise.) 

(a) Show that for every $x \in R_1$ there is a unique $y \in R_2$ such that $ax \equiv y \pmod{n}$. 

(b) By part (a), we can define a function $f: R_1 \to R_2$ by the formula 

f(x) = the unique $y \in R_2$ such that $ax \equiv y \pmod{n}$. 

Show that f is one-to-one. 

(c) Use part (b) to conclude that at least half of the elements of R are 
Fermat witnesses for n. (This shows that, with probability at least 1/2, 
n will fail the Fermat primality test. By repeating the test with different 
choices for a, the probability of an incorrect result can be made as 
small as desired.) 

13. Exercise 12 shows that if there is at least one Fermat witness for n that 
is relatively prime to n, then the Fermat primality test has a good 
chance of detecting that n is not prime. Unfortunately, there are 
composite numbers n for which no such witness exists. An integer n > 2 
is called a Carmichael number if it is not prime, but it is a Fermat 
pseudoprime to the base a for every integer $a \in \{2, 3, \ldots, n - 1\}$ such 
that a and n are relatively prime. They are named for Robert Daniel 
Carmichael (1879–1967), who first studied them. If n is a Carmichael 
number, then although n is not prime, the Fermat primality test is 
unlikely to detect this fact. In 1994, W. R. Alford (1937–2003), Andrew 
Granville (1962–), and Carl Pomerance (1944–) proved that there are 
infinitely many Carmichael numbers. In this problem you will show 
that 561 is a Carmichael number. (In fact, it is the smallest Carmichael 
number.) We leave it to you to verify that 561 = 3 · 11 · 17, so 561 is not 
prime. Suppose $2 \le a \le n - 1$ and gcd(561, a) = 1. 

(a) Show that $a^{560} \equiv 1 \pmod{3}$. 

(b) Show that $a^{560} \equiv 1 \pmod{11}$. 

(c) Show that $a^{560} \equiv 1 \pmod{17}$. 

(d) Show that $a^{560} \equiv 1 \pmod{561}$. (Hint: Use exercise 14 in Section 7.4.) 

14. In this exercise you will work out some of the mathematical basis for 
the Miller-Rabin test, a commonly used probabilistic primality test. It is 
named for Gary L. Miller (1946–) and Michael O. Rabin (1931–). 
Suppose n is an odd integer and n > 1. 

(a) Prove that there are positive integers s and d such that $n - 1 = 2^s d$ and 
d is odd. 

(b) Prove that if n is prime and b is a positive integer such that $b^2 \equiv 1 \pmod{n}$, 
then either $b \equiv 1 \pmod{n}$ or $b \equiv -1 \pmod{n}$. 

(c) Let s and d be as in part (a). If $2 \le a \le n - 1$, $a^d \not\equiv 1 \pmod{n}$, and for 
all natural numbers i < s, $a^{2^i d} \not\equiv -1 \pmod{n}$, then a is called a 
Miller-Rabin witness for n. 

Prove that if there is a Miller-Rabin witness for n then n is not prime. 
(Hint: Suppose a is a Miller-Rabin witness for n and n is prime. Then 
by Euler's theorem, $a^{2^s d} = a^{n-1} \equiv 1 \pmod{n}$. Therefore we can let k 
be the smallest natural number such that $a^{2^k d} \equiv 1 \pmod{n}$. Now use 
part (b) to get a contradiction.) 

The Miller-Rabin test works as follows: To test whether or not an odd 
integer n > 1 is prime, choose a random number $a \in \{2, 3, \ldots, n - 1\}$ 
and check whether or not a is a Miller-Rabin witness for n. If it is, then 
n fails the test. If it is not, then n passes the test. By part (c), if n is 
prime then there are no Miller-Rabin witnesses, so n is guaranteed to 
pass the test. It can be proven that if n is not prime then at least 3/4 of 
the numbers $a \in \{2, 3, \ldots, n - 1\}$ are Miller-Rabin witnesses for n, so 
n will fail the test with probability at least 3/4. As in exercise 12, the 
probability of an incorrect result can be made as small as desired by 
repeating the test with different choices of a. 

(d) Show that 13 is not a Miller-Rabin witness for 85, but 14 is. 
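For readers who want to experiment, the Miller-Rabin test described in exercise 14 might be sketched in Python as follows. The function names and the default of 20 rounds are illustrative assumptions; the witness check follows the definition in part (c) directly.

```python
import random

def is_miller_rabin_witness(a, n):
    """Check the witness condition of exercise 14(c), for odd n > 1 and 2 <= a <= n - 1."""
    s, d = 0, n - 1
    while d % 2 == 0:            # write n - 1 = 2**s * d with d odd
        s += 1
        d //= 2
    x = pow(a, d, n)
    if x == 1:                   # a^d is congruent to 1, so a is not a witness
        return False
    for _ in range(s):           # x runs through a^(2^i d) for i = 0, 1, ..., s - 1
        if x == n - 1:           # a^(2^i d) is congruent to -1, so a is not a witness
            return False
        x = (x * x) % n
    return True

def miller_rabin(n, rounds=20):
    """Return False if n is certainly composite, True if n passed every round."""
    if n < 2 or n % 2 == 0:
        return n == 2
    for _ in range(rounds):
        a = random.randrange(2, n)           # random a in {2, 3, ..., n - 1}
        if is_miller_rabin_witness(a, n):
            return False
    return True

print(miller_rabin(1838127743), miller_rabin(48611))
```

Because each round with a composite n finds a witness with probability at least 3/4, the chance that a composite number survives all `rounds` independent rounds is at most $(1/4)^{\text{rounds}}$.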
Infinite Sets 


8.1. Equinumerous Sets 


In this chapter, we’ll discuss a method of comparing the sizes of infinite 
sets. Surprisingly, we’ll find that, in a sense, infinity comes in different 
sizes! 

For finite sets, we determine the size of a set by counting. What does it 
mean to count the number of elements in a set? When you count the 
elements in a set A, you point to the elements of A in turn while saying the 
words one, two, and so forth. We could think of this process as defining a 
function f from the set {1, 2, ..., n} to A, for some natural number n. For 
each $i \in \{1, 2, \ldots, n\}$, we let f(i) be the element of A you're pointing to 
when you say "i." Because every element of A gets pointed to exactly once, 
the function f is one-to-one and onto. Thus, counting the elements of A is 
simply a method of establishing a one-to-one correspondence between the 
set {1, 2, ..., n} and A, for some natural number n. One-to-one 
correspondence is the key idea behind measuring the sizes of sets, and sets 
of the form {1, 2, ..., n} are the standards against which we measure the 
sizes of finite sets. This suggests the following definition. 


Definition 8.1.1. Suppose A and B are sets. We'll say that A is 
equinumerous with B if there is a function $f: A \to B$ that is one-to-one and 
onto. We'll write A ~ B to indicate that A is equinumerous with B. For each 
natural number n, let $I_n = \{i \in \mathbb{Z}^+ \mid i \le n\}$. A set A is called finite if there is 
a natural number n such that $I_n$ ~ A. Otherwise, A is infinite. 


You are asked in exercise 6 to show that if A is finite, then there is exactly 
one n such that $I_n$ ~ A. Thus, it makes sense to define the number of 
elements of a finite set A to be the unique n such that $I_n$ ~ A. This number is 
also sometimes called the cardinality of A, and it is denoted |A|. Note that 
according to this definition, $\emptyset$ is finite and $|\emptyset| = 0$. 

The definition of equinumerous can also be applied to infinite sets, with 
results that are sometimes surprising. For example, you might think that $\mathbb{Z}^+$ 
could not be equinumerous with $\mathbb{Z}$ because $\mathbb{Z}$ includes not only all the 
positive integers, but also all the negative integers and zero. But consider 
the function $f: \mathbb{Z}^+ \to \mathbb{Z}$ defined as follows: 


$$f(n) = \begin{cases} n/2, & \text{if } n \text{ is even}, \\ (1 - n)/2, & \text{if } n \text{ is odd}. \end{cases}$$


This notation means that for every positive integer n, if n is even then f(n) 
= n/2 and if n is odd then f(n) = (1 - n)/2. The table of values for f in Figure 
8.1 reveals a pattern that suggests that f might be one-to-one and onto. 


Figure 8.1. 


To check this more carefully, first note that for every positive integer n, if 
n is even then f(n) = n/2 > 0, and if n is odd then $f(n) = (1 - n)/2 \le 0$. Now 
suppose $n_1$ and $n_2$ are positive integers and $f(n_1) = f(n_2)$. If $f(n_1) = f(n_2) > 0$ 
then $n_1$ and $n_2$ must both be even, so the equation $f(n_1) = f(n_2)$ means $n_1/2 = 
n_2/2$, and therefore $n_1 = n_2$. Similarly, if $f(n_1) = f(n_2) \le 0$ then $n_1$ and $n_2$ are 
both odd, so we get $(1 - n_1)/2 = (1 - n_2)/2$, and once again it follows that $n_1 
= n_2$. Thus, f is one-to-one. 

To see that f is onto, let m be an arbitrary integer. If m > 0 then let n = 2m, 
an even positive integer, and if $m \le 0$ then let n = 1 - 2m, an odd positive 
integer. In both cases it is easy to verify that f(n) = m. Thus, f is onto as well 
as one-to-one, so according to Definition 8.1.1, $\mathbb{Z}^+ \sim \mathbb{Z}$. 
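To see the pattern concretely, the function f and the inverse suggested by the onto argument can be tabulated with a short Python sketch. The name `f_inverse` is an assumption introduced here for illustration.

```python
def f(n):
    """The function from positive integers to integers defined above."""
    return n // 2 if n % 2 == 0 else (1 - n) // 2

def f_inverse(m):
    """The n chosen in the onto argument: 2m if m > 0, and 1 - 2m otherwise."""
    return 2 * m if m > 0 else 1 - 2 * m

print([f(n) for n in range(1, 8)])           # [0, 1, -1, 2, -2, 3, -3]
print([f_inverse(m) for m in range(-3, 4)])  # [7, 5, 3, 1, 2, 4, 6]
```

The values of f alternate between positive and non-positive integers, moving steadily away from 0, which is why every integer is eventually reached exactly once.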

Note that the function f had to be chosen very carefully. There are many 
other functions from $\mathbb{Z}^+$ to $\mathbb{Z}$ that are one-to-one but not onto, onto but not 
one-to-one, or neither one-to-one nor onto, but this does not contradict our 
claim that $\mathbb{Z}^+ \sim \mathbb{Z}$. According to Definition 8.1.1, to show that $\mathbb{Z}^+ \sim \mathbb{Z}$ we 
need only show that there is at least one function from $\mathbb{Z}^+$ to $\mathbb{Z}$ that is both 
one-to-one and onto, and of course to prove this it suffices to give an 
example of such a function. 

Perhaps an even more surprising example is that $\mathbb{Z}^+ \times \mathbb{Z}^+ \sim \mathbb{Z}^+$. To show 
this we must come up with a one-to-one, onto function $f: \mathbb{Z}^+ \times \mathbb{Z}^+ \to \mathbb{Z}^+$. An 
element of the domain of this function would be an ordered pair (i, j), where 
i and j are positive integers. Exercise 12 asks you to show that the following 
formula defines a function from $\mathbb{Z}^+ \times \mathbb{Z}^+$ to $\mathbb{Z}^+$ that is one-to-one and onto: 


$$f(i, j) = \frac{(i + j - 2)(i + j - 1)}{2} + i.$$


Once again, the table of values in Figure 8.2 may help you understand this 
example. 

Figure 8.2. 
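A small Python sketch can tabulate this pairing function and check, on a finite range, the behavior that exercise 12 asks you to prove in general. The name `pair` and the bound `N = 6` are arbitrary choices made for illustration.

```python
def pair(i, j):
    """f(i, j) = (i + j - 2)(i + j - 1)/2 + i, for positive integers i and j."""
    return (i + j - 2) * (i + j - 1) // 2 + i

N = 6
# Print a table of values: row i, column j.
for i in range(1, N + 1):
    print([pair(i, j) for j in range(1, N + 1)])

# The pairs with i + j <= N + 1 should hit 1, 2, ..., N(N + 1)/2 exactly once each.
values = sorted(pair(i, j)
                for i in range(1, N + 1)
                for j in range(1, N + 1)
                if i + j <= N + 1)
print(values == list(range(1, N * (N + 1) // 2 + 1)))   # True
```

The product of the consecutive integers i + j - 2 and i + j - 1 is always even, so the integer division is exact. The function numbers the pairs diagonal by diagonal, where a diagonal consists of the pairs with a fixed value of i + j.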


Theorem 8.1.2. Suppose A ~ B and C ~ D. Then: 


1. $A \times C \sim B \times D$. 
2. If A and C are disjoint and B and D are disjoint, then $A \cup C \sim B \cup D$. 


Proof. Since A ~ B and C ~ D, we can choose functions $f: A \to B$ and $g: C 
\to D$ that are one-to-one and onto. 


1. Define $h: A \times C \to B \times D$ by the formula 


h(a, c) = (f(a), g(c)). 


To see that h is one-to-one, suppose $h(a_1, c_1) = h(a_2, c_2)$. This means 
that $(f(a_1), g(c_1)) = (f(a_2), g(c_2))$, so $f(a_1) = f(a_2)$ and $g(c_1) = g(c_2)$. 
Since f and g are both one-to-one, it follows that $a_1 = a_2$ and $c_1 = c_2$, 
so $(a_1, c_1) = (a_2, c_2)$. 

To see that h is onto, suppose $(b, d) \in B \times D$. Then since f and g are 
both onto, we can choose $a \in A$ and $c \in C$ such that f(a) = b and g(c) 
= d. Therefore h(a, c) = (f(a), g(c)) = (b, d), as required. Thus h is 
one-to-one and onto, so $A \times C \sim B \times D$. 


2. Suppose A and C are disjoint and B and D are disjoint. You are asked 
in exercise 14 to show that $f \cup g$ is a one-to-one, onto function from $A 
\cup C$ to $B \cup D$, so $A \cup C \sim B \cup D$. □ 


It is not hard to show that ~ is reflexive, symmetric, and transitive. In 
other words, we have the following theorem: 


Theorem 8.1.3. For any sets A, B, and C: 


1. A ~ A. 
2. If A ~ B then B ~ A. 
3. If A ~ B and B ~ C then A ~ C. 


Proof. 


1. The identity function $i_A$ is a one-to-one, onto function from A to A. 


2. Suppose A ~ B. Then we can choose some function $f: A \to B$ that is 
one-to-one and onto. By Theorem 5.3.4, $f^{-1}$ is a function from B to A. 
But now note that $(f^{-1})^{-1} = f$, which is a function from A to B, so by 
Theorem 5.3.4 again, $f^{-1}$ is also one-to-one and onto. Therefore B ~ 
A. 


3. Suppose A ~ B and B ~ C. Then we can choose one-to-one, onto 
functions $f: A \to B$ and $g: B \to C$. By Theorem 5.2.5, $g \circ f: A \to C$ is 
one-to-one and onto, so A ~ C. □ 


Theorems 8.1.2 and 8.1.3 are often helpful in showing that sets are 
equinumerous. For example, we showed earlier that $\mathbb{Z}^+ \times \mathbb{Z}^+ \sim \mathbb{Z}^+$ and $\mathbb{Z}^+ 
\sim \mathbb{Z}$, so by part 3 of Theorem 8.1.3 it follows that $\mathbb{Z}^+ \times \mathbb{Z}^+ \sim \mathbb{Z}$. Part 2 tells 
us that we need not distinguish between the statements "A is equinumerous 
with B" and "B is equinumerous with A," because they are equivalent. For 
example, we already know that $\mathbb{Z}^+ \times \mathbb{Z}^+ \sim \mathbb{Z}^+$, so we can also write $\mathbb{Z}^+ \sim 
\mathbb{Z}^+ \times \mathbb{Z}^+$. By part 1 of Theorem 8.1.2, $\mathbb{Z}^+ \times \mathbb{Z}^+ \sim \mathbb{Z} \times \mathbb{Z}$, so we also have $\mathbb{Z}^+ 
\sim \mathbb{Z} \times \mathbb{Z}$. 

We have now found three sets, $\mathbb{Z}$, $\mathbb{Z}^+ \times \mathbb{Z}^+$, and $\mathbb{Z} \times \mathbb{Z}$, that are 
equinumerous with $\mathbb{Z}^+$. Such sets are especially important and have a 
special name. 


Definition 8.1.4. A set A is called denumerable if $\mathbb{Z}^+ \sim A$. It is called 
countable if it is either finite or denumerable. Otherwise, it is uncountable. 


You might think of the countable sets as those sets whose elements can 
be counted by pointing to all of them, one by one, while naming positive 
integers in order. If the counting process ends at some point, then the set is 
finite; and if it never ends, then the set is denumerable. The following 
theorem gives two more ways of thinking about countable sets. 


Theorem 8.1.5. Suppose A is a set. The following statements are 
equivalent: 


1. A is countable. 


2. Either $A = \emptyset$ or there is a function $f: \mathbb{Z}^+ \to A$ that is onto. 


3. There is a function $f: A \to \mathbb{Z}^+$ that is one-to-one. 


Proof. 1 → 2. Suppose A is countable. If A is denumerable, then there is a 
function $f: \mathbb{Z}^+ \to A$ that is one-to-one and onto, so clearly statement 2 is 
true. Now suppose A is finite. If $A = \emptyset$ then there is nothing more to prove, 
so suppose $A \ne \emptyset$. Then we can choose some element $a_0 \in A$. Let $g: I_n \to 
A$ be a one-to-one, onto function, where n is the number of elements of A. 
Now define $f: \mathbb{Z}^+ \to A$ as follows: 


$$f(i) = \begin{cases} g(i), & \text{if } i \le n, \\ a_0, & \text{if } i > n. \end{cases}$$

It is easy to check now that f is onto, as required. 

2 → 3. Suppose that either $A = \emptyset$ or there is an onto function from $\mathbb{Z}^+$ to 
A. We consider these two possibilities in turn. If $A = \emptyset$, then the empty set 
is a one-to-one function from A to $\mathbb{Z}^+$. Now suppose $g: \mathbb{Z}^+ \to A$, and g is 
onto. Then for each $a \in A$, the set $\{n \in \mathbb{Z}^+ \mid g(n) = a\}$ is not empty, so by 
the well-ordering principle it must have a smallest element. Thus, we can 
define a function $f: A \to \mathbb{Z}^+$ by the formula 


f(a) = the smallest $n \in \mathbb{Z}^+$ such that g(n) = a. 


Note that for each $a \in A$, g(f(a)) = a, so $g \circ f = i_A$. But then by Theorem 
5.3.3, it follows that f is one-to-one, as required. 

3 → 1. Suppose $g: A \to \mathbb{Z}^+$ and g is one-to-one. Let $B = \mathrm{Ran}(g) \subseteq \mathbb{Z}^+$. 
Then g maps onto B. This means that if we think of g as a function from A 
to B, then it is one-to-one and onto, so A ~ B. Thus, it suffices to show that 
B is countable, since by Theorem 8.1.3 it follows from this that A is also 
countable. 


Suppose B is not finite. We must show that B is denumerable, which we 
can do by defining a one-to-one, onto function $f: \mathbb{Z}^+ \to B$. The idea behind 
the definition is simply to let f(n) be the nth element of B, for each $n \in \mathbb{Z}^+$. 
(Recall that $B \subseteq \mathbb{Z}^+$, so we can use the ordering of the positive integers to 
make sense of the idea of the nth element of B.) For a more careful 
definition of f and the proof that f is one-to-one and onto, see exercise 15. □ 
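The construction used in the step 2 → 3 of the proof above can be illustrated on a small example. In the Python sketch below, the onto function `g` and the five-element set `A` are hypothetical choices made only for illustration; the helper `least_preimage` searches for the smallest n with g(n) = a, which is the value the proof assigns to f(a).

```python
from itertools import count

def least_preimage(g, a):
    """Return the smallest positive integer n with g(n) == a.

    The search terminates only if some such n exists, which is guaranteed when g is onto.
    """
    for n in count(1):
        if g(n) == a:
            return n

def g(n):
    """A hypothetical onto function from the positive integers to A = {0, 1, 2, 3, 4}."""
    return n % 5

A = range(5)
f = {a: least_preimage(g, a) for a in A}

print(f)                                   # {0: 5, 1: 1, 2: 2, 3: 3, 4: 4}
print(all(g(f[a]) == a for a in A))        # True: g composed with f is the identity on A
print(len(set(f.values())) == len(f))      # True: f is one-to-one
```

The second check mirrors the equation $g \circ f = i_A$ from the proof, which is exactly what forces f to be one-to-one.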


If A is countable and $A \ne \emptyset$, then by Theorem 8.1.5 there is a function $f: 
\mathbb{Z}^+ \to A$ that is onto. If, for every $n \in \mathbb{Z}^+$, we let $a_n = f(n)$, then the fact that 
f is onto means that every element of A appears at least once in the list $a_1, 
a_2, a_3, \ldots$. In other words, $A = \{a_1, a_2, a_3, \ldots\}$. Countability of a set A is 
often used in this way to enable us to write the elements of A in a list, 
indexed by the positive integers. In fact, you might want to think of 
countability for nonempty sets as meaning listability. Of course, if A is 
denumerable, then the function f can be taken to be one-to-one, which 
means that each element of A will appear only once in the list $a_1, a_2, a_3, \ldots$. 
For an example of an application of countability in which the elements of 
a countable set are written in a list, see exercise 19. 


Theorem 8.1.5 is also sometimes useful for proving that a set is 
denumerable, as the proof of our next theorem shows. 


Theorem 8.1.6. $\mathbb{Q}$ is denumerable. 

Proof. Let $f: \mathbb{Z} \times \mathbb{Z}^+ \to \mathbb{Q}$ be defined as follows: 

f(p, q) = p/q. 


Clearly f is onto, since by definition all rational numbers can be written as 
fractions, but note that f is not one-to-one. For example, f(1, 2) = f(2, 4) = 
1/2. Since $\mathbb{Z}^+ \sim \mathbb{Z}$, by Theorem 8.1.2 we have $\mathbb{Z}^+ \times \mathbb{Z}^+ \sim \mathbb{Z} \times \mathbb{Z}^+$, and since 
we already know that $\mathbb{Z}^+ \times \mathbb{Z}^+$ is denumerable, it follows that $\mathbb{Z} \times \mathbb{Z}^+$ is also 
denumerable. Thus, we can choose a one-to-one, onto function $g: \mathbb{Z}^+ \to \mathbb{Z} \times 
\mathbb{Z}^+$. By Theorem 5.2.5, $f \circ g: \mathbb{Z}^+ \to \mathbb{Q}$ is onto, so by Theorem 8.1.5, $\mathbb{Q}$ is 
countable. Clearly $\mathbb{Q}$ is not finite, so it must be denumerable. □ 
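The proof shows that the rationals can be written in a list, possibly with repetitions. The following Python sketch produces such a list and skips the repetitions. It is one possible realization, not the specific function g chosen in the proof: it walks through the pairs (i, j) of positive integers diagonal by diagonal, converts i to an integer p using the one-to-one, onto function from the positive integers to the integers defined earlier in this section, and then forms p/j.

```python
from fractions import Fraction
from itertools import count, islice

def rationals():
    """Yield every rational number exactly once (a sketch, not an efficient enumeration)."""
    seen = set()
    for s in count(2):                      # diagonal of pairs (i, j) with i + j = s
        for i in range(1, s):
            j = s - i
            # Map the positive integer i to an integer p: n/2 if even, (1 - n)/2 if odd.
            p = i // 2 if i % 2 == 0 else (1 - i) // 2
            x = Fraction(p, j)              # f(p, j) = p/j, automatically in lowest terms
            if x not in seen:               # f is not one-to-one, so skip repeated values
                seen.add(x)
                yield x

print([str(x) for x in islice(rationals(), 7)])
# ['0', '1', '1/2', '-1', '1/3', '-1/2', '2']
```

Skipping repeated values turns the onto listing into a one-to-one listing, which reflects the conclusion that the rationals are denumerable rather than merely countable.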


Although our focus in this chapter is on infinite sets, the methods in this 
section can be used to prove theorems that are useful for computing the 
cardinalities of finite sets. We end this section with one example of such a 
theorem, and give several other examples in the exercises (see exercises 
20-30). 


Theorem 8.1.7. Suppose A and B are disjoint finite sets. Then $A \cup B$ is 
finite, and $|A \cup B| = |A| + |B|$. 

Proof. Let n = |A| and m = |B|. Then $A \sim I_n$ and $B \sim I_m$. Notice that if $x \in 
I_m$ then $1 \le x \le m$, and therefore $n + 1 \le x + n \le n + m$, so $x + n \in I_{n+m} \setminus I_n$. 
Thus we can define a function $f: I_m \to I_{n+m} \setminus I_n$ by the formula f(x) = x + n. 
It is easy to check that f is one-to-one and onto, so $I_m \sim I_{n+m} \setminus I_n$. Since $B \sim 
I_m$ it follows that $B \sim I_{n+m} \setminus I_n$. Applying part 2 of Theorem 8.1.2, we can 
conclude that $A \cup B \sim I_n \cup (I_{n+m} \setminus I_n) = I_{n+m}$. Therefore $A \cup B$ is finite, and 
$|A \cup B| = n + m = |A| + |B|$. □ 


Exercises 


1. Show that the following sets are denumerable. 

(a) $\mathbb{N}$. 

(b) The set of all even integers. 

2. Show that the following sets are denumerable: 

(a) $\mathbb{Q} \times \mathbb{Q}$. 

(b) $\mathbb{Q}(\sqrt{2})$. (See exercise 21(b) of Section 5.4 for the meaning of the 
notation used here.) 

3. In this problem we'll use the following notation for intervals of real 
numbers. If a and b are real numbers and a < b, then 

$[a, b] = \{x \in \mathbb{R} \mid a \le x \le b\}$,
$(a, b) = \{x \in \mathbb{R} \mid a < x < b\}$,
$(a, b] = \{x \in \mathbb{R} \mid a < x \le b\}$,
$[a, b) = \{x \in \mathbb{R} \mid a \le x < b\}$.

(a) Show that [0, 1] ~ [0, 2]. 

(b) Show that $(-\pi/2, \pi/2) \sim \mathbb{R}$. (Hint: Use a trigonometric function.) 

(c) Show that $(0, 1) \sim \mathbb{R}$. 

(d) Show that (0, 1] ~ (0, 1). 

4. Justify your answer to each question with either a proof or a 
counterexample. 

(a) Suppose A ~ B and $A \times C \sim B \times D$. Must it be the case that C ~ D? 

(b) Suppose A ~ B, A and C are disjoint, B and D are disjoint, and $A \cup C 
\sim B \cup D$. Must it be the case that C ~ D? 

5. Prove that if A ~ B then $\mathcal{P}(A) \sim \mathcal{P}(B)$. 

6. (a) Prove that for all natural numbers n and m, if $I_n \sim I_m$ then n = m. 
(Hint: Use induction on n.) 

(b) Prove that if A is finite, then there is exactly one natural number n such 
that $I_n \sim A$. 

7. Suppose A and B are sets and A is finite. Prove that A ~ B iff B is also 
finite and |A| = |B|. 

8. (a) Prove that if $n \in \mathbb{N}$ and $A \subseteq I_n$ then A is finite and $|A| \le n$. 
Furthermore, if $A \ne I_n$ then |A| < n. 

(b) Prove that if A is finite and $B \subseteq A$, then B is also finite, and $|B| \le |A|$. 
Furthermore, if $B \ne A$, then |B| < |A|. 

9. Suppose $B \subseteq A$, $B \ne A$, and B ~ A. Prove that A is infinite. 

10. Prove that if $n \in \mathbb{N}$, $f: I_n \to B$, and f is onto, then B is finite and $|B| \le n$. 

11. Suppose A and B are finite sets and $f: A \to B$. 

(a) Prove that if |A| < |B| then f is not onto. 

(b) Prove that if |A| > |B| then f is not one-to-one. (This is sometimes called 
the pigeonhole principle, because it means that if n items are put into m 
pigeonholes, where n > m, then some pigeonhole must contain more 
than one item.) 

(c) Prove that if |A| = |B| then f is one-to-one iff f is onto. 

12. Show that the function $f: \mathbb{Z}^+ \times \mathbb{Z}^+ \to \mathbb{Z}^+$ defined by the formula 

$$f(i, j) = \frac{(i + j - 2)(i + j - 1)}{2} + i$$

is one-to-one and onto. 

13. In this exercise you will give another proof that $\mathbb{Z}^+ \times \mathbb{Z}^+ \sim \mathbb{Z}^+$. Let $f: 
\mathbb{Z}^+ \times \mathbb{Z}^+ \to \mathbb{Z}^+$ be defined by the formula 

$$f(m, n) = 2^{m-1}(2n - 1).$$

Prove that f is one-to-one and onto. 

14. Complete the proof of part 2 of Theorem 8.1.2 by showing that if $f: A 
\to B$ and $g: C \to D$ are one-to-one, onto functions, A and C are disjoint, 
and B and D are disjoint, then $f \cup g$ is a one-to-one, onto function from 
$A \cup C$ to $B \cup D$. 

15. In this exercise you will complete the proof of 3 → 1 of Theorem 8.1.5. 
Suppose $B \subseteq \mathbb{Z}^+$ and B is infinite. We now define a function $f: \mathbb{Z}^+ \to B$ 
by recursion as follows: 

For all $n \in \mathbb{Z}^+$, 

f(n) = the smallest element of $B \setminus \{f(m) \mid m \in \mathbb{Z}^+, m < n\}$. 

Of course, the definition is recursive because the specification of f(n) 
refers to f(m) for all m < n. 

(a) Suppose $n \in \mathbb{Z}^+$. The definition of f(n) only makes sense if we can be 
sure that $B \setminus \{f(m) \mid m \in \mathbb{Z}^+, m < n\} \ne \emptyset$, in which case the 
well-ordering principle guarantees that it has a smallest element. Prove that 
$B \setminus \{f(m) \mid m \in \mathbb{Z}^+, m < n\} \ne \emptyset$. (Hint: See exercises 8 and 10.) 

(b) Prove that for all $n \in \mathbb{Z}^+$, $f(n) \ge n$. 

(c) Prove that f is one-to-one and onto. 

16. In this exercise you will give an alternative proof of Theorem 8.1.6. 

(a) Find a function $f: \mathbb{Z}^+ \to \mathbb{Z} \setminus \{0\}$ that is one-to-one and onto. 

(b) Let $g: \mathbb{Z}^+ \to \mathbb{Q}^+$ be defined as follows. Suppose $n \in \mathbb{Z}^+$ and the prime 
factorization of n is $n = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$, where $p_1, p_2, \ldots, p_k$ are prime 
numbers, $p_1 < p_2 < \cdots < p_k$, and $e_1, e_2, \ldots, e_k$ are positive integers. 
Then we let 

$$g(n) = g(p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}) = p_1^{f(e_1)} p_2^{f(e_2)} \cdots p_k^{f(e_k)},$$

where f is the function from part (a). (As in Section 7.2, we consider 
the empty product to be 1, so that g(1) = 1.) Prove that g is one-to-one 
and onto. (Hint: You will find exercise 19 in Section 7.2 useful.) 

(c) Use g to define a one-to-one, onto function $h: \mathbb{Z} \to \mathbb{Q}$, and conclude 
that $\mathbb{Q}$ is denumerable. 

17. Prove that if $B \subseteq A$ and A is countable, then B is countable. 

18. Prove that if $B \subseteq A$, A is infinite, and B is finite, then $A \setminus B$ is infinite. 


19. Suppose A is denumerable and R is a partial order on A. Prove that R 
can be extended to a total order on A. In other words, prove that there is 
a total order T on A such that $R \subseteq T$. Note that we proved a similar 
theorem for finite A in Example 6.2.2. (Hint: Since A is denumerable, 
we can write the elements of A in a list: $A = \{a_1, a_2, a_3, \ldots\}$. Now, 
using exercise 2 of Section 6.2, recursively define partial orders $R_n$, for 
$n \in \mathbb{N}$, so that $R = R_0 \subseteq R_1 \subseteq R_2 \subseteq \cdots$ and $\forall i \in I_n \forall j \in \mathbb{Z}^+ ((a_i, a_j) \in 
R_n \lor (a_j, a_i) \in R_n)$. Let $T = \bigcup_{n \in \mathbb{N}} R_n$.) 

20. Suppose A is finite and $B \subseteq A$. By exercise 8, B and $A \setminus B$ are both 
finite. Prove that $|A \setminus B| = |A| - |B|$. (In particular, if $a \in A$ then $|A \setminus \{a\}| 
= |A| - 1$. We used this fact in several proofs in Chapter 6; for example, 
we used it in Examples 6.2.1 and 6.2.2.) 

21. Suppose n is a positive integer and for each $i \in I_n$, $A_i$ is a finite set. 
Also, assume that $\forall i \in I_n \forall j \in I_n (i \ne j \to A_i \cap A_j = \emptyset)$. Prove that 
$\bigcup_{i \in I_n} A_i$ is finite and $|\bigcup_{i \in I_n} A_i| = \sum_{i=1}^{n} |A_i|$. 

*22. (a) Prove that if A and B are finite sets, then $A \times B$ is finite and $|A \times 
B| = |A| \cdot |B|$. (Hint: Use induction on |B|. In other words, prove the 
following statement by induction: $\forall n \in \mathbb{N}\, \forall A\, \forall B$ (if A and B are 
finite and |B| = n, then $A \times B$ is finite and $|A \times B| = |A| \cdot n$). You 
may find Theorem 4.1.3 useful.) 

(b) A meal at Alice's Restaurant consists of an entree and a dessert. The 
entree can be either steak, chicken, pork chops, shrimp, or spaghetti, 
and dessert can be either ice cream, cake, or pie. How many different 
meals can you order at Alice's Restaurant? 

23. For any sets A and B, the set of all functions from A to B is denoted ${}^{A}B$. 

(a) Prove that if A ~ B and C ~ D then ${}^{A}C \sim {}^{B}D$. 

(b) Prove that if A, B, and C are sets and $A \cap B = \emptyset$, then ${}^{A \cup B}C \sim {}^{A}C \times {}^{B}C$. 

(c) Prove that if A and B are finite sets, then ${}^{A}B$ is finite and $|{}^{A}B| = |B|^{|A|}$. 
(Hint: Use induction on |A|.) 

(d) A professor has 20 students in his class, and he has to assign a grade of 
either A, B, C, D, or F to each student. In how many ways can the 
grades be assigned? 


24. Suppose |A| = n, and let F = {f | f is a one-to-one, onto function from
I_n to A}.

(a) Prove that F is finite, and |F| = n!. (Hint: Use induction on n.)

(b) Let L = {R | R is a total order on A}. Prove that F ∼ L, and therefore
|L| = n!.

(c) Five people are to sit in a row of five seats. In how many ways can they
be seated?

25. Suppose A is a finite set and R is an equivalence relation on A.
Suppose also that there is some positive integer n such that
∀x ∈ A(|[x]_R| = n). Prove that A/R is finite and |A/R| = |A|/n. (Hint: Use
exercise 21.)

26.
(a) Suppose that A and B are finite sets. Prove that A ∪ B is finite, and
|A ∪ B| = |A| + |B| − |A ∩ B|.

(b) Suppose that A, B, and C are finite sets. Prove that A ∪ B ∪ C is finite,
and

|A ∪ B ∪ C| = |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| + |A ∩ B ∩ C|.

27. In this problem you will prove the inclusion-exclusion principle, which
generalizes the formulas in exercise 26. Suppose A_1, A_2, ..., A_n are
finite sets. Let P = 𝒫(I_n) \ {∅}, and for each S ∈ P let A_S = ⋂_{i∈S} A_i.
Prove that ⋃_{i∈I_n} A_i is finite and

|⋃_{i∈I_n} A_i| = Σ_{S∈P} (−1)^{|S|+1} |A_S|.

(The notation on the right-hand side of this equation denotes the result
of running through all sets S ∈ P, computing the number (−1)^{|S|+1} |A_S|
for each S, and then adding these numbers. Hint: Use induction on n.)

28. Prove that if A and B are finite sets and |A| = |B|, then |A △ B| is even.

29. Each customer in a certain bank has a PIN number, which is a
sequence of four digits. Show that if the bank has more than 10,000
customers, then some two customers must have the same PIN number.
(Hint: See exercise 11.)


30. Alice opened her grade report and exclaimed, “I can’t believe Prof. 
Jones flunked me in Probability.” “You were in that course?” said Bob. 
“That’s funny, I was in it too, and I don’t remember ever seeing you 
there.” “Well,” admitted Alice sheepishly, “I guess I did skip class a 
lot.” “Yeah, me too” said Bob. Prove that either Alice or Bob missed at 
least half of the classes. 


8.2. Countable and Uncountable Sets 


Often when we perform some set-theoretic operation with countable sets, 
the result is again a countable set. 


Theorem 8.2.1. Suppose A and B are countable sets. Then: 

1. A × B is countable.

2. A ∪ B is countable.

Proof. Since A and B are countable, by Theorem 8.1.5 we can choose
one-to-one functions f: A → Z⁺ and g: B → Z⁺.

1. Define h: A × B → Z⁺ × Z⁺ by the formula

h(a, b) = (f(a), g(b)).

As in the proof of part 1 of Theorem 8.1.2, it is not hard to show that
h is one-to-one. Since Z⁺ × Z⁺ is denumerable, we can let
j: Z⁺ × Z⁺ → Z⁺ be a one-to-one, onto function. Then by Theorem 5.2.5,
j ∘ h: A × B → Z⁺ is one-to-one, so by Theorem 8.1.5, A × B is countable.

2. Define h: A ∪ B → Z as follows:

h(x) = f(x), if x ∈ A,
h(x) = −g(x), if x ∉ A.

We claim now that h is one-to-one. To see why, suppose that h(x_1) =
h(x_2), for some x_1 and x_2 in A ∪ B. If h(x_1) = h(x_2) > 0, then according
to the definition of h, we must have x_1 ∈ A, x_2 ∈ A, and f(x_1) = h(x_1)
= h(x_2) = f(x_2). But then since f is one-to-one, x_1 = x_2. Similarly, if
h(x_1) = h(x_2) < 0, then we must have g(x_1) = −h(x_1) = −h(x_2) = g(x_2),
and then since g is one-to-one, x_1 = x_2. Thus, h is one-to-one.

Since Z is denumerable, we can let j: Z → Z⁺ be a one-to-one, onto
function. As in part 1, we then find that j ∘ h: A ∪ B → Z⁺ is
one-to-one, so A ∪ B is countable. □
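The construction in part 2 can be tried out on small finite sets. The
following Python sketch is only an illustration, not part of the proof: the
sets A and B, the helper make_injection, and the particular bijection j from
the integers to the positive integers are all choices made for this example.

```python
def make_injection(s):
    """Return a one-to-one map from the finite set s into the positive integers."""
    return {x: i for i, x in enumerate(sorted(s), start=1)}

def j(z):
    """One choice of bijection from the integers onto the positive integers:
    0 -> 1, 1 -> 2, -1 -> 3, 2 -> 4, -2 -> 5, ..."""
    return 2 * z if z > 0 else 1 - 2 * z

# Hypothetical example sets; they overlap on purpose.
A = {"a", "b", "c"}
B = {"b", "c", "d", "e"}
f = make_injection(A)
g = make_injection(B)

def h(x):
    """h(x) = f(x) if x is in A, and -g(x) otherwise, as in part 2 of the proof."""
    return f[x] if x in A else -g[x]

# j composed with h assigns a distinct positive integer to each element of A ∪ B.
codes = {x: j(h(x)) for x in A | B}
```

Elements of A receive even codes and elements of B that are not in A receive
odd codes, so no two elements of the union can share a code.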


As our next theorem shows, part 2 of Theorem 8.2.1 can be extended to 
unions of more than two sets. 


Theorem 8.2.2. The union of countably many countable sets is countable. 
In other words, if ℱ is a family of sets, ℱ is countable, and also every
element of ℱ is countable, then ⋃ℱ is countable.

Proof. We will assume first that ∅ ∉ ℱ. At the end of the proof we will
discuss the case ∅ ∈ ℱ.

If ℱ = ∅, then of course ⋃ℱ = ∅, which is countable. Now suppose ℱ ≠
∅. Then, as described after the proof of Theorem 8.1.5, since ℱ is
countable and nonempty we can write the elements of ℱ in a list, indexed
by the positive integers. In other words, we can say that
ℱ = {A_1, A_2, A_3, ...}. Similarly, every element of ℱ is countable and
nonempty (since ∅ ∉ ℱ), so for each positive integer i the elements of A_i
can be written in a list. Thus we can write

A_1 = {a^1_1, a^1_2, a^1_3, ...},

A_2 = {a^2_1, a^2_2, a^2_3, ...},

and, in general,

A_i = {a^i_1, a^i_2, a^i_3, ...}.

Note that, by the definition of union, ⋃ℱ = {a^i_j | i ∈ Z⁺, j ∈ Z⁺}.

Now define a function f: Z⁺ × Z⁺ → ⋃ℱ by the formula

f(i, j) = a^i_j.

Clearly f is onto. Since Z⁺ × Z⁺ is denumerable, we can let g: Z⁺ → Z⁺ ×
Z⁺ be a one-to-one, onto function. Then f ∘ g: Z⁺ → ⋃ℱ is onto, so ⋃ℱ
is countable.

Finally, suppose ∅ ∈ ℱ. Let ℱ′ = ℱ \ {∅}. Then ℱ′ is also a countable
family of countable sets and ∅ ∉ ℱ′, so by the earlier reasoning, ⋃ℱ′ is
countable. But clearly ⋃ℱ = ⋃ℱ′, so ⋃ℱ is countable too. □
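To see the composition of f with g at work, here is a small Python sketch. It
assumes a particular family chosen only for illustration, in which the jth
listed element of A_i is the fraction i/j, so that the union is the set of
positive rationals. The generator pairs() is one possible enumeration of all
pairs of positive integers; the proof does not depend on which enumeration is
used.

```python
from fractions import Fraction
from itertools import count, islice

def pairs():
    """Yield every pair (i, j) of positive integers exactly once,
    one anti-diagonal i + j = s at a time."""
    for s in count(2):
        for i in range(1, s):
            yield (i, s - i)

def a(i, j):
    """Hypothetical family for this example: the j-th listed element of A_i is i/j."""
    return Fraction(i, j)

def enumerate_union():
    """Yield the elements of the union in the order given by the enumeration of
    pairs; repeats are allowed because the map only needs to be onto."""
    for i, j in pairs():
        yield a(i, j)

first = list(islice(enumerate_union(), 10))
```

The number 1 shows up more than once in this list (as 1/1 and as 2/2), which
is harmless: the argument only needs every element of the union to appear at
least once.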


Another operation that preserves countability is the formation of finite 
sequences. Suppose A is a set and a_1, a_2, ..., a_n is a list of elements
of A. We might specify the terms in this list with a function f: I_n → A,
where for each i, f(i) = a_i = the ith term in the list. Such a function is
called a finite sequence of elements of A.


Definition 8.2.3. Suppose A is a set. A function f: I_n → A, where n is a
natural number, is called a finite sequence of elements of A, and n is called
the length of the sequence.


Theorem 8.2.4. Suppose A is a countable set. Then the set of all finite 
sequences of elements of A is also countable. 


Proof. For each n ∈ N, let S_n be the set of all sequences of length n of
elements of A. We first show that for every n ∈ N, S_n is countable. We
proceed by induction on n.

In the base case we assume n = 0. Note that I_0 = ∅, so a sequence of
length 0 is a function f: ∅ → A, and the only such function is ∅. Thus,
S_0 = {∅}, which is clearly a countable set.

For the induction step, suppose n is a natural number and S_n is countable.
We must show that S_{n+1} is countable. Consider the function
F: S_n × A → S_{n+1} defined as follows:

F(f, a) = f ∪ {(n + 1, a)}.

In other words, for any sequence f ∈ S_n and any element a ∈ A, F(f, a) is
the sequence you get by starting with f, which is a sequence of length n, and
then tacking on a as term number n + 1. You are asked in exercise 2 to
verify that F is one-to-one and onto. Thus, S_n × A ∼ S_{n+1}. But S_n and A
are both countable, so by Theorem 8.2.1, S_n × A is countable, and therefore
S_{n+1} is countable.

This completes the inductive proof that for every n ∈ N, S_n is countable.
Finally, note that the set of all finite sequences of elements of A is
⋃_{n∈N} S_n, and this is countable by Theorem 8.2.2. □
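The function F from the induction step is easy to experiment with. In the
Python sketch below, a sequence is represented as a set of (index, term)
pairs, matching the view of a function as a set of ordered pairs; the
two-element set A is a stand-in chosen for the example, and checking a few
finite levels is of course no substitute for the proof requested in exercise 2.

```python
def F(f, a):
    """Extend a sequence f of length n (a set of (index, term) pairs) by the term a."""
    n = len(f)
    return f | frozenset({(n + 1, a)})

A = {"x", "y"}                 # a small stand-in for the countable set A
S = [{frozenset()}]            # S[0] contains only the empty sequence
for n in range(3):
    # Every sequence of length n + 1 arises as F(f, a) for some f in S[n], a in A.
    S.append({F(f, a) for f in S[n] for a in A})

sizes = [len(level) for level in S]
```

Each level has exactly |A| times as many sequences as the previous one, which
is what one would expect if F is a one-to-one correspondence between the set
of pairs (f, a) and the sequences of length n + 1.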


As an example of the use of Theorem 8.2.4, you should be able to show 
that the set of all grammatical sentences of English is a denumerable set. 
(See exercise 17.) 

By now you may be wondering if perhaps all sets are countable! Is there 
any set-theoretic operation that can be used to produce uncountable sets? 
We’ll see in our next theorem that the answer is yes, the power set 
operation. This fact was discovered by the German mathematician Georg 
Cantor (1845-1918) by means of a famous and ingenious proof. In fact, it 
was Cantor who first conceived of the idea of comparing the sizes of 
infinite sets. Cantor’s proof is somewhat harder than the previous proofs in 
this chapter, so we’ll discuss the strategy behind the proof before presenting 
the proof itself. 


Theorem 8.2.5. (Cantor’s theorem) 𝒫(Z⁺) is uncountable.


Scratch work 


The proof is based on statement 2 of Theorem 8.1.5. We’ll show that there
is no function f: Z⁺ → 𝒫(Z⁺) that is onto. Clearly 𝒫(Z⁺) ≠ ∅, so by
Theorem 8.1.5 this shows that 𝒫(Z⁺) is not countable.

Our strategy will be to let f: Z⁺ → 𝒫(Z⁺) be an arbitrary function and
prove that f is not onto. Reexpressing this negative goal as a positive
statement, we must show that ∃D[D ∈ 𝒫(Z⁺) ∧ ∀n ∈ Z⁺(D ≠ f(n))]. This
suggests that we should try to find a particular set D for which we can prove
both D ∈ 𝒫(Z⁺) and ∀n ∈ Z⁺(D ≠ f(n)). This is the most difficult step in
figuring out the proof. There is a set D that makes the proof work, but it
will take some cleverness to come up with it.

We want to make sure that D ∈ 𝒫(Z⁺), or in other words D ⊆ Z⁺, so we
know that we need only consider positive integers when deciding what the
elements of D should be. But this still leaves us infinitely many decisions to
make: for each positive integer n, we must decide whether or not we want n
to be an element of D. We also need to make sure that ∀n ∈ Z⁺(D ≠ f(n)).
This imposes infinitely many restrictions on our choice of D: for each
positive integer n, we must make sure that D ≠ f(n). Why not make each of
our infinitely many decisions in such a way that it guarantees that the
corresponding restriction is satisfied? In other words, for each positive
integer n, we’ll make our decision about whether or not n is an element of
D in such a way that it will guarantee that D ≠ f(n). This isn’t hard to do.
We can let n be an element of D if n ∉ f(n), and leave n out of D if n ∈
f(n). This will guarantee that D ≠ f(n), because one of these sets will contain
n as an element and the other won’t. This suggests that we should let
D = {n ∈ Z⁺ | n ∉ f(n)}.

Figure 8.3 may help you understand the definition of the set D. For each
m ∈ Z⁺, f(m) is a subset of Z⁺, and it can be specified by saying, for each
positive integer n, whether or not n ∈ f(m). The answers to these questions
can be arranged in a table as shown in Figure 8.3. Each row of the table
gives the answers needed to specify the set f(m) for a particular value of m.
The set D can also be specified with a row of yesses and noes, as shown at
the bottom of Figure 8.3. For each n ∈ Z⁺ we’ve decided to determine
whether or not n ∈ D by asking whether or not n ∈ f(n), and the answers to
these questions are the ones surrounded by boxes in Figure 8.3. Because n
∈ D iff n ∉ f(n), the row of yesses and noes that specifies D can be found
by reading the boxed answers along the diagonal of Figure 8.3 and
reversing all the answers. This is guaranteed to be different from every row
of the table in Figure 8.3, because for each n ∈ Z⁺ it differs from row n in
the nth position.


If you found this reasoning difficult to follow, don’t worry about it. 
Remember, the reasoning used in choosing the set D won’t be part of the 
proof anyway! After you finish reading the proof, you can go back and try 
reading the last two paragraphs again. 


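[Figure: a table of yes/no answers to the question “Is n ∈ f(m)?”, with one
row for each m and one column for each n, and the diagonal entries boxed. A
final row answers “Is n ∈ D?”: no, no, yes, no, yes, ...]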


Figure 8.3. 


It should be clear that the set D we have chosen is a subset of Z⁺, so D ∈
𝒫(Z⁺). Our other goal is to prove that ∀n ∈ Z⁺(D ≠ f(n)), so we let n be
an arbitrary positive integer and prove D ≠ f(n). Now recall that we chose D
carefully so that we would be able to prove D ≠ f(n), and the reasoning
behind this choice hinged on whether or not n ∈ f(n). Perhaps the easiest
way to write the proof is to consider the two cases n ∈ f(n) and n ∉ f(n)
separately. In each case, applying the definition of D easily leads to the
conclusion that D ≠ f(n).


Proof. Suppose f: Z⁺ → 𝒫(Z⁺). We will show that f cannot be onto by
finding a set D ∈ 𝒫(Z⁺) such that D ∉ Ran(f). Let D = {n ∈ Z⁺ | n ∉
f(n)}. Clearly D ⊆ Z⁺, so D ∈ 𝒫(Z⁺). Now let n be an arbitrary positive
integer. We consider two cases.

Case 1. n ∈ f(n). Since D = {n ∈ Z⁺ | n ∉ f(n)}, we can conclude that n
∉ D. But then since n ∈ f(n) and n ∉ D, it follows that D ≠ f(n).

Case 2. n ∉ f(n). Then by the definition of D, n ∈ D. Since n ∈ D and n
∉ f(n), D ≠ f(n).

Since these cases are exhaustive, this shows that ∀n ∈ Z⁺(D ≠ f(n)), so
D ∉ Ran(f). Since f was arbitrary, this shows that there is no onto function
f: Z⁺ → 𝒫(Z⁺). Clearly 𝒫(Z⁺) ≠ ∅, so by Theorem 8.1.5, 𝒫(Z⁺) is
uncountable. □
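The definition of D can be tested on a finite analogue of the theorem, in
which the positive integers are replaced by {1, ..., 5}. The Python sketch
below uses a function f made up for this illustration; it shows the mechanics
of the diagonal construction and is not a proof of anything about infinite
sets.

```python
def diagonal_set(f, n):
    """Return D = {k in {1, ..., n} | k not in f(k)} for a function f given as a dict."""
    return {k for k in range(1, n + 1) if k not in f[k]}

# A hypothetical f from {1, ..., 5} to the subsets of {1, ..., 5}.
f = {
    1: {1, 2, 4},
    2: {2, 3},
    3: {1, 5},
    4: {4},
    5: set(),
}

D = diagonal_set(f, 5)
# D differs from f(k) at the element k itself, for every k.
misses_everything = all(D != f[k] for k in f)
```

Here 1, 2, and 4 belong to their own images and 3 and 5 do not, so D = {3, 5},
and D is different from each of the five sets f(1), ..., f(5).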


The method used in the proof of Theorem 8.2.5 is called diagonalization 
because of the diagonal arrangement of the boxed answers in Figure 8.3. 
Diagonalization is a powerful technique that can be used to prove many 
theorems, including our next theorem. However, rather than doing another 
diagonalization argument, we’ll simply apply Theorem 8.2.5 to prove the 
next theorem. 


Theorem 8.2.6. R is uncountable.


Proof. We will define a function f: 𝒫(Z⁺) → R and show that f is
one-to-one. If R were countable, then there would be a one-to-one function
g: R → Z⁺. But then g ∘ f would be a one-to-one function from 𝒫(Z⁺) to Z⁺
and therefore 𝒫(Z⁺) would be countable, contradicting Cantor’s theorem.
Thus, this will show that R is uncountable.

To define f, suppose A ∈ 𝒫(Z⁺). Then f(A) will be a real number
between 0 and 1 that we will specify by giving its decimal expansion. For
each positive integer n, the nth digit of f(A) will be the number d_n defined
as follows:

d_n = 3, if n ∉ A,
d_n = 7, if n ∈ A.

In other words, in decimal notation we have f(A) = 0.d_1 d_2 d_3 .... For
example, if E is the set of all positive even integers, then
f(E) = 0.37373737.... If P is the set of all prime numbers, then
f(P) = 0.37737373337....

To see that f is one-to-one, suppose that A ∈ 𝒫(Z⁺), B ∈ 𝒫(Z⁺), and A
≠ B. Then there is some n ∈ Z⁺ such that either n ∈ A and n ∉ B, or n ∈
B and n ∉ A. But then f(A) and f(B) cannot be equal, since their decimal
expansions differ in the nth digit. Thus, f is one-to-one. □
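The two decimal expansions given as examples in the proof can be reproduced
with a few lines of Python. In this sketch a set of positive integers is
represented by a membership test; the helper names are choices made for the
example, and only finitely many digits can be printed.

```python
def digits(in_A, count):
    """First `count` decimal digits of f(A): 7 where n is in A, 3 where it is not."""
    return "0." + "".join("7" if in_A(n) else "3" for n in range(1, count + 1))

def is_even(n):
    return n % 2 == 0

def is_prime(n):
    # Trial division is enough for the small values used here.
    return n >= 2 and all(n % d for d in range(2, int(n ** 0.5) + 1))

even_digits = digits(is_even, 8)      # "0.37373737"
prime_digits = digits(is_prime, 11)   # "0.37737373337"
```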


Exercises 


*1.
(a) Prove that the set of all irrational numbers, R \ Q, is uncountable.

(b) Prove that R \ Q ∼ R.

2. Let F: S_n × A → S_{n+1} be the function defined in the proof of Theorem
8.2.4. Show that F is one-to-one and onto.

3. In this exercise you will give an alternative proof of Theorem 8.2.4.
Suppose A is a countable set, and let S be the set of all finite sequences
of elements of A. Since A is countable, there is a one-to-one function g:
A → Z⁺. For each positive integer n, let p_n be the nth prime number;
thus, p_1 = 2, p_2 = 3, and so on. Define F: S → Z⁺ as follows: Suppose f
∈ S and the length of f is n. Then

F(f) = p_1^{g(f(1))} p_2^{g(f(2))} ⋯ p_n^{g(f(n))}.

Show that F is one-to-one, and therefore S is countable.

4. Let P = {X ∈ 𝒫(Z⁺) | X is finite}. Prove that P is denumerable.

5. Prove the following more general form of Cantor’s theorem: For any
set A, A ≁ 𝒫(A). (Hint: Imitate the proof of Theorem 8.2.5.)

6. For the meaning of the notation used in this exercise, see exercise 23 of
Section 8.1.

(a) Prove that for any sets A, B, and C, ^A (B × C) ∼ ^A B × ^A C.

(b) Prove that for any sets A, B, and C, ^(A×B) C ∼ ^A (^B C).

(c) Prove that for any set A, 𝒫(A) ∼ ^A {yes, no}. (Note that if A is finite
and |A| = n then, by exercise 23(c) of Section 8.1, it follows that |𝒫(A)|
= |{yes, no}|^|A| = 2^n. Of course, you already proved this, by a different
method, in exercise 11 of Section 6.2.)

(d) Prove that ^(Z⁺) 𝒫(Z⁺) ∼ 𝒫(Z⁺).


7. Suppose A is denumerable. Prove that there is a partition P of A such
that P is denumerable and for every X ∈ P, X is denumerable.

*8. Prove that if A and B are disjoint sets, then 𝒫(A ∪ B) ∼ 𝒫(A) × 𝒫(B).

9.
(a) Suppose A_1 ⊆ A_2 ⊆ A_3 ⊆ ⋯ and ⋃_{n∈Z⁺} A_n = R. Prove that for
every uncountable set B ⊆ R there is some positive integer n such
that B ∩ A_n is uncountable.

(b) Suppose A_1 ⊆ A_2 ⊆ A_3 ⊆ ⋯ and ⋃_{n∈Z⁺} A_n = Z⁺. Suppose also that
for every infinite set B ⊆ Z⁺ there is some positive integer n such that
B ∩ A_n is infinite. Prove that for some n, A_n = Z⁺.

10. Suppose A ⊆ R⁺, b ∈ R⁺, and for every list a_1, a_2, ..., a_n of finitely
many distinct elements of A, a_1 + a_2 + ⋯ + a_n ≤ b. Prove that A is
countable. (Hint: For each positive integer n, let A_n = {x ∈ A | x ≥ 1/n}.
What can you say about the number of elements in A_n?)

11. Suppose E is an equivalence relation on R and for all real numbers x
and y, [x]_E ∼ [y]_E. Prove that either R/E is uncountable or for every x
∈ R, [x]_E is uncountable.

12. A real number x is called algebraic if there is a positive integer n and
integers a_0, a_1, ..., a_n such that a_0 + a_1 x + a_2 x^2 + ⋯ + a_n x^n = 0
and a_n ≠ 0. Let A be the set of all algebraic numbers.

(a) Prove that Q ⊆ A.

(b) Prove that √2 ∈ A.

(c) Prove that A is countable. Note: You may use the fact that if n is a
positive integer, a_0, a_1, ..., a_n are integers, and a_n ≠ 0, then
{x ∈ R | a_0 + a_1 x + a_2 x^2 + ⋯ + a_n x^n = 0} is finite.

13. Suppose ℱ ⊆ {f | f: Z⁺ → R} and ℱ is countable. Prove that there is a
function g: Z⁺ → R such that ℱ ⊆ O(g). (See exercise 19 of Section 5.1
for the meaning of the notation used here.)

14. Suppose ℱ ⊆ 𝒫(Z⁺) and ℱ is pairwise disjoint. Prove that ℱ is
countable.


*15. If A and B are infinite sets, we say that A and B are almost disjoint if A
∩ B is finite. If ℱ is a family of infinite sets, then we say that ℱ is
pairwise almost disjoint if for all A and B in ℱ, if A ≠ B then A and B
are almost disjoint. In this exercise you will prove that there is some ℱ
⊆ 𝒫(Z⁺) such that all elements of ℱ are infinite, ℱ is pairwise almost
disjoint, and ℱ is uncountable. (Contrast this with the previous
exercise.)

Let P = {X ∈ 𝒫(Z⁺) | X is finite} and Q = {X ∈ 𝒫(Z⁺) | X is
infinite}. By exercise 4, P is denumerable, so we can choose a
one-to-one, onto function g: P → Z⁺.

(a) Prove that Q is uncountable.

(b) For each A ∈ Q, let S_A = {A ∩ I_n | n ∈ Z⁺}. For example, if A is the
set of all prime numbers, then S_A = {∅, {2}, {2, 3}, {2, 3, 5}, ...}. (We
might describe S_A as the set of all initial segments of A.) Prove that if
A ∈ Q then S_A ⊆ P and S_A is infinite.

(c) Prove that if A, B ∈ Q and A ≠ B then S_A ∩ S_B is finite.

(d) Let ℱ = {g(S_A) | A ∈ Q}. Prove that ℱ ⊆ 𝒫(Z⁺), every element of ℱ is
infinite, ℱ is pairwise almost disjoint, and ℱ is uncountable.

16. Prove that there is a function f: Z⁺ → Z⁺ such that for all positive
integers a, b, and c there is some positive integer n such that f(an + b) =
c.

17. Prove that the set of all grammatical sentences of English is
denumerable. (Hint: Every grammatical sentence of English is a finite
sequence of English words. First show that the set of all grammatical
sentences is countable, and then show that it is infinite.)

18. Some real numbers can be defined by a phrase in the English language.
For example, the phrase “the ratio of the circumference of a circle to its
diameter” defines the number π.

(a) Prove that the set of numbers that can be defined by an English phrase
is denumerable. (Hint: See exercise 17.)

(b) Prove that there are real numbers that cannot be defined by English
phrases.

8.3. The Cantor-Schröder-Bernstein Theorem


Suppose A and B are sets and f is a one-to-one function from A to B. Then f
shows that A ∼ Ran(f) ⊆ B, so it is natural to think of B as being at least as
large as A. This suggests the following notation:


Definition 8.3.1. If A and B are sets, then we will say that B dominates A,
and write A ≾ B, if there is a function f: A → B that is one-to-one. If A ≾ B
and A ≁ B, then we say that B strictly dominates A, and write A ≺ B.


For example, in the proof of Theorem 8.2.6 we gave a one-to-one
function f: 𝒫(Z⁺) → R, so 𝒫(Z⁺) ≾ R. Of course, for any sets A and B, if
A ∼ B then also A ≾ B. It should also be clear that if A ⊆ B then A ≾ B.
For example, Z⁺ ≾ R. In fact, by Theorem 8.2.6 we also know that Z⁺ ≁ R,
so we can say that Z⁺ ≺ R.


You might think that ≾ would be a partial order, but it turns out that it
isn’t. You’re asked in exercise 1 to check that ≾ is reflexive and transitive,
but it is not antisymmetric. (In the terminology of exercise 25 of Section
4.5, ≾ is a preorder.) For example, Z⁺ ∼ Q, so Z⁺ ≾ Q and Q ≾ Z⁺, but of
course Z⁺ ≠ Q. But this suggests an interesting question: If A ≾ B and B ≾
A, then A and B might not be equal, but must they be equinumerous?

The answer, it turns out, is yes, as we’ll prove in our next theorem. 
Several mathematicians’ names are usually associated with this theorem. 
Cantor was the first person to state the theorem, and he gave a partial proof. 
Later, Ernst Schröder (1841-1902) and Felix Bernstein (1878-1956)
discovered proofs independently. 


Theorem 8.3.2. (Cantor-Schröder-Bernstein theorem) Suppose A and B are
sets. If A ≾ B and B ≾ A, then A ∼ B.


Scratch work 


We start by assuming that A ≾ B and B ≾ A, which means that we can
choose one-to-one functions f: A → B and g: B → A. To prove that A ∼ B
we need to find a one-to-one, onto function h: A → B.

At this point, we don’t know much about A and B. The only tools we
have to help us match up the elements of A and B are the functions f and g.
If f is onto, then of course we can let h = f; and if g is onto, then we can let
h = g⁻¹. But it may turn out that neither f nor g is onto. How can we come up
with the required function h in this case?

Our solution will be to combine parts of f and g⁻¹ to get h. To do this,
we’ll split A into two pieces X and Y, and B into two pieces W and Z, in
such a way that X and W can be matched up by f, and Y and Z can be
matched up by g. More precisely, we’ll have W = f(X) = {f(x) | x ∈ X} and Y
= g(Z) = {g(z) | z ∈ Z}. The situation is illustrated in Figure 8.4. Once we
have this, we’ll be able to define h by letting h(a) = f(a) for a ∈ X, and h(a)
= g⁻¹(a) for a ∈ Y.


[Figure: the sets A and B, with A split into the pieces X and Y and B split
into the pieces W and Z.]


Figure 8.4. 


How can we choose the sets X, Y, W, and Z? First of all, note that every
element of Y must be in Ran(g), so any element of A that is not in Ran(g)
must be in X. In other words, if we let A_1 = A \ Ran(g), then we must have
A_1 ⊆ X. But now consider any a ∈ A_1. We know that we must have a ∈ X,
and therefore f(a) ∈ W. But now note that since g is one-to-one, g(f(a))
will be different from g(z) for every z ∈ Z, and therefore g(f(a)) ∉ g(Z) =
Y. Thus, we must have g(f(a)) ∈ X. Since a was an arbitrary element of A_1,
this shows that if we let A_2 = g(f(A_1)) = {g(f(a)) | a ∈ A_1}, then we must
have A_2 ⊆ X. Similarly, if we let A_3 = g(f(A_2)), then it will turn out that
we must have A_3 ⊆ X. Continuing in this way we can define sets A_n for every
positive integer n, and for every n we must have A_n ⊆ X. As you will see,
letting X = ⋃_{n∈Z⁺} A_n works. In the following proof, we actually do not
mention the sets W and Z.


Proof. Suppose A ≾ B and B ≾ A. Then we can choose one-to-one
functions f: A → B and g: B → A. Let R = Ran(g) ⊆ A. Then g maps onto R,
so by Theorem 5.3.4, g⁻¹: R → B.

We now define a sequence of sets A_1, A_2, A_3, ... by recursion as follows:

A_1 = A \ R;
for every n ∈ Z⁺, A_{n+1} = g(f(A_n)) = {g(f(a)) | a ∈ A_n}.

Let X = ⋃_{n∈Z⁺} A_n and Y = A \ X. Of course, every element of A is in
either X or Y, but not both. Now define h: A → B as follows:

h(a) = f(a), if a ∈ X,
h(a) = g⁻¹(a), if a ∈ Y.

Note that for every a ∈ A, if a ∉ R then a ∈ A_1 ⊆ X. Thus, if a ∈ Y then a
∈ R, so g⁻¹(a) is defined. Therefore this definition makes sense.

We will show that h is one-to-one and onto, which will establish that A ∼
B. To see that h is one-to-one, suppose a_1 ∈ A, a_2 ∈ A, and h(a_1) = h(a_2).

Case 1. a_1 ∈ X. Suppose a_2 ∈ Y. Then according to the definition of h,
h(a_1) = f(a_1) and h(a_2) = g⁻¹(a_2). Thus, the equation h(a_1) = h(a_2) means
f(a_1) = g⁻¹(a_2), so g(f(a_1)) = g(g⁻¹(a_2)) = a_2. Since a_1 ∈ X =
⋃_{n∈Z⁺} A_n, we can choose some n ∈ Z⁺ such that a_1 ∈ A_n. But then
a_2 = g(f(a_1)) ∈ g(f(A_n)) = A_{n+1}, so a_2 ∈ X, contradicting our
assumption that a_2 ∈ Y.

Thus, a_2 ∉ Y, so a_2 ∈ X. This means that h(a_2) = f(a_2), so from the
equation h(a_1) = h(a_2) we get f(a_1) = f(a_2). But f is one-to-one, so it
follows that a_1 = a_2.

Case 2. a_1 ∈ Y. As in case 1, if a_2 ∈ X, then we can derive a
contradiction, so we must have a_2 ∈ Y. Thus, the equation h(a_1) = h(a_2)
means g⁻¹(a_1) = g⁻¹(a_2). Therefore, a_1 = g(g⁻¹(a_1)) = g(g⁻¹(a_2)) = a_2.

In both cases we have a_1 = a_2, so h is one-to-one.

To see that h is onto, suppose b ∈ B. Then g(b) ∈ A, so either g(b) ∈ X
or g(b) ∈ Y.

Case 1. g(b) ∈ X. Choose n such that g(b) ∈ A_n. Note that g(b) ∈ Ran(g)
= R and A_1 = A \ R, so g(b) ∉ A_1. Thus, n > 1, so A_n = g(f(A_{n−1})), and
therefore we can choose some a ∈ A_{n−1} such that g(f(a)) = g(b). But then
since g is one-to-one, f(a) = b. Since a ∈ A_{n−1}, a ∈ X, so h(a) = f(a) = b.
Thus, b ∈ Ran(h).

Case 2. g(b) ∈ Y. Then h(g(b)) = g⁻¹(g(b)) = b, so b ∈ Ran(h).

In both cases we have b ∈ Ran(h), so h is onto. □
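The construction of h in this proof can be carried out by hand in simple
cases. The Python sketch below uses an example chosen purely for
illustration: both sets are the natural numbers, and both one-to-one
functions add 1. Membership in X is decided by tracing an element backward
through g and f; that works here because each backward step decreases the
element, which need not happen for other choices of f and g.

```python
def f(a):            # one-to-one map from N to N, not onto (0 is missed)
    return a + 1

def g(b):            # one-to-one map from N to N, not onto (0 is missed)
    return b + 1

def g_inv(a):
    """Inverse of g on Ran(g); returns None when a is not in Ran(g)."""
    return a - 1 if a >= 1 else None

def f_inv(b):
    """Inverse of f on Ran(f); returns None when b is not in Ran(f)."""
    return b - 1 if b >= 1 else None

def in_X(a):
    """Decide whether a lies in some A_n by tracing a back through g and f.
    Each step lowers a by 2 in this example, so the loop terminates."""
    while True:
        b = g_inv(a)
        if b is None:        # a is not in Ran(g), so a is in A_1
            return True
        a = f_inv(b)
        if a is None:        # a is in Ran(g) but is not g(f(a')) for any a'
            return False

def h(a):
    """h(a) = f(a) on X and the inverse of g on Y, as in the proof."""
    return f(a) if in_X(a) else g_inv(a)

values = [h(a) for a in range(10)]
```

In this example X turns out to be the set of even natural numbers, and h
simply swaps 0 with 1, 2 with 3, and so on, which is visibly one-to-one and
onto even though neither f nor g is onto.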


The Cantor-Schröder-Bernstein theorem is often useful for showing that
sets are equinumerous. For example, in exercise 3 of Section 8.1 you were
asked to show that (0, 1] ∼ (0, 1), where

(0, 1] = {x ∈ R | 0 < x ≤ 1}

and

(0, 1) = {x ∈ R | 0 < x < 1}.

It is surprisingly difficult to find a one-to-one correspondence between
these two sets, but it is easy to show that they are equinumerous using the
Cantor-Schröder-Bernstein theorem. Of course, (0, 1) ⊆ (0, 1], so clearly
(0, 1) ≾ (0, 1]. For the other direction, define f: (0, 1] → (0, 1) by the
formula f(x) = x/2. It is easy to check that this function is one-to-one
(although it is not onto), so (0, 1] ≾ (0, 1). Thus, by the
Cantor-Schröder-Bernstein theorem, (0, 1] ∼ (0, 1). For more on this example
see exercise 9.


Our next theorem gives a more surprising consequence of the
Cantor-Schröder-Bernstein theorem.


Theorem 8.3.3. R ∼ 𝒫(Z⁺).


It is quite difficult to prove Theorem 8.3.3 directly by giving an example
of a one-to-one, onto function from R to 𝒫(Z⁺). In our proof we’ll use the
Cantor-Schröder-Bernstein theorem and the following lemma.


Lemma 8.3.4. Suppose x and y are real numbers and x < y. Then there is a 
rational number q such that x < q < y. 


Proof. Let k be a positive integer larger than 1/(y − x). Then 1/k < y − x. We
will show that there is a fraction with denominator k that is between x and y.

Let m and n be integers such that m < x < n, and let S = {j ∈ N | m + j/k >
x}. Note that m + k(n − m)/k = n > x, and therefore k(n − m) ∈ S. Thus S ≠ ∅,
so by the well-ordering principle it has a smallest element. Let j be the
smallest element of S. Note also that m + 0/k = m < x, so 0 ∉ S, and
therefore j > 0. Thus, j − 1 is a natural number, but since j is the smallest
element of S, j − 1 ∉ S. It follows that m + (j − 1)/k ≤ x.

Let q = m + j/k. Clearly q is a rational number, and since j ∈ S, q = m +
j/k > x. Also, combining the observations that m + (j − 1)/k ≤ x and 1/k < y −
x, we have

q = m + j/k = m + (j − 1)/k + 1/k ≤ x + 1/k < x + (y − x) = y.

Thus, we have x < q < y, as required. □
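The proof is constructive, so the rational number q can actually be computed.
The Python sketch below follows the steps of the proof, with one
simplification that is an assumption of the example: x and y are given as
exact fractions standing in for real numbers, so that all the arithmetic is
exact. The function name and the specific choices of k and m are ours; the
proof allows any k larger than 1/(y − x) and any integer m below x.

```python
from fractions import Fraction
from math import floor

def rational_between(x, y):
    """Return q = m + j/k with x < q < y, following the proof of the lemma.
    Assumes x < y and that both are given as Fractions so arithmetic is exact."""
    k = floor(1 / (y - x)) + 1        # a positive integer larger than 1/(y - x)
    m = floor(x) - 1                  # an integer with m < x
    j = floor((x - m) * k) + 1        # the smallest natural number with m + j/k > x
    return m + Fraction(j, k)

q = rational_between(Fraction(1, 3), Fraction(3, 8))
```

For x = 1/3 and y = 3/8 this gives k = 25, m = −1, and j = 34, so q = 9/25,
which does lie strictly between 1/3 and 3/8.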


Proof of Theorem 8.3.3. As we noted earlier, we already know that 𝒫(Z⁺)
≾ R. Now consider the function f: R → 𝒫(Q) defined as follows:

f(x) = {q ∈ Q | q < x}.

We claim that f is one-to-one. To see why, suppose x ∈ R, y ∈ R, and x ≠ y.
Then either x < y or y < x. Suppose first that x < y. By Lemma 8.3.4, we can
choose a rational number q such that x < q < y. But then q ∈ f(y) and q ∉
f(x), so f(x) ≠ f(y). A similar argument shows that if y < x then f(x) ≠ f(y),
so f is one-to-one.

Since f is one-to-one, we have shown that R ≾ 𝒫(Q). But we also know
that Q ∼ Z⁺, so by exercise 5 in Section 8.1 it follows that 𝒫(Q) ∼ 𝒫(Z⁺).
Thus, R ≾ 𝒫(Q) ≾ 𝒫(Z⁺), so by transitivity of ≾ (see exercise 1) we
have R ≾ 𝒫(Z⁺). Combining this with the fact that 𝒫(Z⁺) ≾ R and
applying the Cantor-Schröder-Bernstein theorem, we conclude that R ∼
𝒫(Z⁺). □


We said at the beginning of this chapter that we would show that infinity 
comes in different sizes. We now see that, so far, we have found only two 
sizes of infinity. One size is represented by the denumerable sets, which are 
all equinumerous with each other. The only examples of nondenumerable
infinite sets we have given so far are 𝒫(Z⁺) and R, which we now know
are equinumerous. In fact, there are many more sizes of infinity. For
example, 𝒫(R) is an infinite set that is neither denumerable nor
equinumerous with R. Thus, it represents a third size of infinity. For more
on this see exercise 8.

Because Z⁺ ≺ R, it is natural to think of the set of real numbers as larger
than the set of positive integers. In 1878, Cantor asked whether there was a
size of infinity between these two sizes. More precisely, is there a set X
such that Z⁺ ≺ X ≺ R? Cantor conjectured that the answer was no, but he
was unable to prove it. His conjecture is known as the continuum
hypothesis. At the Second International Congress of Mathematicians in 
1900, David Hilbert (1862-1943) gave a famous lecture in which he listed 
what he believed to be the most important unsolved mathematical problems 
of the time, and the proof or disproof of the continuum hypothesis was 
number one on his list. 

The status of the continuum hypothesis was “resolved” in a remarkable
way by the work of Kurt Gödel (1906-1978) in 1939 and Paul Cohen
(1934-2007) in 1963. The resolution turns out to require even more careful
analyses than we have given in this book of both the notion of proof and the
basic assumptions underlying set theory. Once such analyses have been
given, it is possible to prove theorems about what can be proven and what
cannot be proven. What Gödel and Cohen proved was that, using the
methods of mathematical proof and set-theoretic assumptions accepted by 
most mathematicians today, it is impossible to prove the continuum 
hypothesis, and it is also impossible to disprove it! 


Exercises 


*1. Prove that ≾ is reflexive and transitive. In other words:

(a) For every set A, A ≾ A.

(b) For all sets A, B, and C, if A ≾ B and B ≾ C then A ≾ C.

2. Prove that ≺ is irreflexive and transitive. In other words:

(a) For every set A, A ⊀ A.

(b) For all sets A, B, and C, if A ≺ B and B ≺ C then A ≺ C.

3. Suppose A ⊆ B ⊆ C and A ∼ C. Prove that B ∼ C.

4. Suppose A ≾ B and C ≾ D.

(a) Prove that A × C ≾ B × D.

(b) Prove that if A and C are disjoint and B and D are disjoint, then A ∪ C
≾ B ∪ D.

(c) Prove that 𝒫(A) ≾ 𝒫(B).

5. For the meaning of the notation used in this exercise, see exercise 23 of
Section 8.1. Suppose A ≾ B and C ≾ D.

(a) Prove that if A ≠ ∅ then ^A C ≾ ^B D.

(b) Is the assumption that A ≠ ∅ needed in part (a)?

6.
(a) Prove that if A ≾ B and B is finite, then A is finite and |A| ≤ |B|.

(b) Prove that if A ≺ B and B is finite, then A is finite and |A| < |B|.

7. Prove that for every set A, A ≺ 𝒫(A). (Hint: See exercise 5 of Section
8.2. Note that in particular, if A is finite and |A| = n then, as you showed
in exercise 11 of Section 6.2, and again in exercise 6(c) of Section 8.2,
|𝒫(A)| = 2^n. It follows, by exercise 6(b), that 2^n > n. Of course, you
already proved this, by a different method, in exercise 12(a) of Section
6.3.)


*8. Let A_1 = Z⁺, and for all n ∈ Z⁺ let A_{n+1} = 𝒫(A_n).

(a) Prove that for all n ∈ Z⁺ and m ∈ Z⁺, if n < m then A_n ≺ A_m.

(b) The sets A_n, for n ∈ Z⁺, represent infinitely many sizes of infinity. Are
there any more sizes of infinity? In other words, can you think of an
infinite set that is not equinumerous with A_n for any n ∈ Z⁺?

9. The proof of the Cantor-Schröder-Bernstein theorem gives a method for
constructing a one-to-one and onto function h: A → B from one-to-one
functions f: A → B and g: B → A. Use this method to find a one-to-one,
onto function h: (0, 1] → (0, 1). Start with the functions f: (0, 1] → (0,
1) and g: (0, 1) → (0, 1] given by the formulas:

f(x) = x/2, g(x) = x.

10. Let ℰ = {R | R is an equivalence relation on Z⁺}.

(a) Prove that ℰ ≾ 𝒫(Z⁺).

(b) Let A = Z⁺ \ {1, 2} and let P be the set of all partitions of Z⁺. Define
f: 𝒫(A) → P by the formula f(X) = {X ∪ {1}, (A \ X) ∪ {2}}. Prove that
f is one-to-one.

(c) Prove that ℰ ∼ 𝒫(Z⁺).

11. Let 𝒯 = {R | R is a total order on Z⁺}. Prove that 𝒯 ∼ 𝒫(Z⁺). (Hint:
Imitate the solution to exercise 10.)

12.
(a) Prove that if A has at least two elements and A × A ∼ A then 𝒫(A)
× 𝒫(A) ∼ 𝒫(A).

(b) Prove that R × R ∼ R.

13. An interval is a set I ⊆ R with the property that for all real numbers x,
y, and z, if x ∈ I, z ∈ I, and x < y < z, then y ∈ I. An interval is
nondegenerate if it contains at least two different real numbers.
Suppose ℱ is a set of nondegenerate intervals and ℱ is pairwise disjoint.
Prove that ℱ is countable. (Hint: By Lemma 8.3.4, every nondegenerate
interval contains a rational number.)


14. For the meaning of the notation used in this exercise, see exercise 23 of 
Section 8.1. 


(a) Prove that® R ~ A(R). 

(b) Prove that2 R ~ R. 

(c) (For readers who have studied calculus.) Let C = {f E P R | f is 
continuous}. Prove that C ~ R. (Hint: Show that if f and g are 
continuous functions and Vx E Q(f (x) = g(x)), then f = g.) 


1 We should really be a bit more careful here. It is actually possible for two different decimal
expansions to represent the same number. For example, in a calculus class you may have
learned the surprising fact that 0.999 . . . = 1.000 . . . . However, this only happens with
decimal expansions that end with either an infinite sequence of 9s or an infinite sequence of
0s. For decimal expansions made up of 3s and 7s, different decimal expansions always
represent different numbers.


Appendix 


Solutions to Selected Exercises 


Introduction 


1. (a) One possible answer is 32,767 = 31 · 1057.

(b) One possible answer is x = 2³¹ − 1 = 2,147,483,647.
3. (a) The method yields the prime number 211.

(b) The method yields two primes, 3 and 37.
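
The arithmetic behind these answers is easy to check by machine. The following Python sketch is only an illustration: the helper name prime_factors is ours, and we are assuming that exercise 1(b) asks for a nontrivial divisor of 2^32767 − 1 and that the numbers produced in exercise 3 are 2 · 3 · 5 · 7 + 1 = 211 and 2 · 5 · 11 + 1 = 111, since the solutions above do not restate the exercises.

```python
def prime_factors(n):
    """Return the prime factorization of n (n >= 2) as a sorted list, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


# 1(a): 32,767 = 2^15 - 1 factors as 31 * 1057.
product_matches = (2**15 - 1 == 32767 == 31 * 1057)

# 1(b) (assumed reading): x = 2^31 - 1 divides 2^32767 - 1 exactly when
# 2^32767 leaves remainder 1 on division by x.
x = 2**31 - 1
x_divides = pow(2, 32767, x) == 1

# 3 (assumed inputs): factor the two numbers the method produces.
factors_a = prime_factors(2 * 3 * 5 * 7 + 1)   # 211
factors_b = prime_factors(2 * 5 * 11 + 1)      # 111

print(product_matches, x_divides, factors_a, factors_b)
```

Trial division is enough here because the numbers being factored are tiny; the divisibility claim in 1(b) is tested with modular exponentiation so that the 32767-bit number is never factored directly.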


Chapter 1 


Section 1.1 


1. (a) (R ∨ H) ∧ ¬(H ∧ T), where R stands for the statement “We’ll
have a reading assignment,” H stands for “We’ll have homework
problems,” and T stands for “We’ll have a test.”

(b) ¬G ∨ (G ∧ ¬S), where G stands for “You’ll go skiing,” and S stands
for “There will be snow.”

(c) ¬[(√7 < 2) ∨ (√7 = 2)].

6. (a) I won’t buy the pants without the shirt.

(b) I won’t buy the pants and I won’t buy the shirt.

(c) Either I won’t buy the pants or I won’t buy the shirt.


Section 1.2 
1. (a) P  Q  ¬P ∨ Q
       F  F    T
       F  T    T
       T  F    F
       T  T    T
   (b) S  G  (S ∨ G) ∧ (¬S ∨ ¬G)
       F  F    F
       F  T    T
       T  F    T
       T  T    F
5. (a) P  Q  P ↓ Q
       F  F    T
       F  T    F
       T  F    F
       T  T    F
   (b) ¬(P ∨ Q).

   (c) ¬P is equivalent to P ↓ P, P ∨ Q is equivalent to (P ↓ Q) ↓ (P ↓ Q),
   and P ∧ Q is equivalent to (P ↓ P) ↓ (Q ↓ Q).
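
The equivalences in exercise 5(c) can be confirmed by running through all four rows of the truth table. Here is a small Python sketch of that check; the function names nor and check_nor_identities are ours, chosen only for this illustration.

```python
from itertools import product


def nor(p, q):
    """P ↓ Q: true exactly when P and Q are both false."""
    return not (p or q)


def check_nor_identities():
    """Check the three equivalences of exercise 5(c) on every row of the truth table."""
    for p, q in product([False, True], repeat=2):
        if nor(p, p) != (not p):
            return False
        if nor(nor(p, q), nor(p, q)) != (p or q):
            return False
        if nor(nor(p, p), nor(q, q)) != (p and q):
            return False
    return True


print(check_nor_identities())
```

Of course, a check like this is not a substitute for the truth tables themselves; it simply automates the row-by-row comparison.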


7. (a) and (c) are valid; (b) and (d) are invalid. 
9. (a) is neither a contradiction nor a tautology; (b) is a contradiction; 
(c) and (d) are tautologies. 


11. (a) P ∨ Q.
(b) P.
(c) ¬P ∨ Q.


14. We use the associative law for ∧ twice:


[P ∧ (Q ∧ R)] ∧ S is equivalent to [(P ∧ Q) ∧ R] ∧ S


which is equivalent to (P ∧ Q) ∧ (R ∧ S).


16. P ∨ ¬Q.


Section 1.3 


1. (a) D(6) ∧ D(9) ∧ D(15), where D(x) means “x is divisible by 3.”
(b) D(x, 2) ∧ D(x, 3) ∧ ¬D(x, 4), where D(x, y) means “x is divisible by
y.”
(c) N(x) ∧ N(y) ∧ [(P(x) ∧ ¬P(y)) ∨ (P(y) ∧ ¬P(x))], where N(x) means
“x is a natural number” and P(x) means “x is prime.”

3. (a) {x | x is a planet}.

(b) {x | x is an Ivy League school}.

(c) {x | x is a state in the United States}.

(d) {x | x is a province or territory in Canada}.

5. (a) (−3 ∈ R) ∧ (13 − 2(−3) > 1). Bound variables: x; no free
variables. This statement is true.

(b) (4 ∈ R) ∧ (4 < 0) ∧ (13 − 2(4) > 1). Bound variables: x; no free
variables. This statement is false.

(c) ¬[(5 ∈ R) ∧ (13 − 2(5) > c)]. Bound variables: x; free variables: c.


8. (a) {x | Elizabeth Taylor was once married to x} = {Conrad Hilton
Jr., Michael Wilding, Michael Todd, Eddie Fisher, Richard
Burton, John Warner, Larry Fortensky}.


(b) {x | x is a logical connective studied in Section 1.1} = {∧, ∨, ¬}.
(c) {x | x is the author of this book} = {Daniel J. Velleman}.


Section 1.4 


1. (a) {3,12}. 
(b) {1, 12, 20, 35}. 
(c) {1, 3, 12, 20, 35}. 
The sets in parts (a) and (b) are both subsets of the set in part (c). 
4. (a) Both Venn diagrams look like this: 


(b) Both Venn diagrams look like this: 


9. Sets (a), (d), and (e) are equal, and sets (b) and (c) are equal. 
12. (a) There is no region corresponding to the set (A ∩ D) \ (B ∪ C),
but this set could have elements.


(b) Here is one possibility: 


14. The Venn diagrams for both sets look like this: 


Section 1.5 
1. (a) (S ∨ ¬E) → ¬H, where S stands for “This gas has an unpleasant
smell,” E stands for “This gas is explosive,” and H stands for
“This gas is hydrogen.”
(b) (F ∧ H) → D, where F stands for “George has a fever,” H stands for
“George has a headache,” and D stands for “George will go to the
doctor.”
(c) (F → D) ∧ (H → D), where the letters have the same meanings as in
part (b).
(d) (x ≠ 2) → (P(x) → O(x)), where P(x) stands for “x is prime” and
O(x) stands for “x is odd.”
4. (a) and (b) are valid, but (c) is invalid.
7. (a) Either make a truth table, or reason as follows:
(P → R) ∧ (Q → R) is equivalent to (¬P ∨ R) ∧ (¬Q ∨ R)
which is equivalent to (¬P ∧ ¬Q) ∨ R
which is equivalent to ¬(P ∨ Q) ∨ R
which is equivalent to (P ∨ Q) → R.

(b) (P → R) ∨ (Q → R) is equivalent to (P ∧ Q) → R.
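
The truth-table route mentioned in exercise 7(a) can also be carried out mechanically. In the Python sketch below, implies and equivalent are hypothetical helper names introduced for this illustration; equivalent simply compares two formulas on every assignment of truth values.

```python
from itertools import product


def implies(p, q):
    """P → Q, using the equivalence with ¬P ∨ Q."""
    return (not p) or q


def equivalent(f, g, num_vars):
    """True if formulas f and g agree on every assignment of truth values."""
    return all(f(*row) == g(*row)
               for row in product([False, True], repeat=num_vars))


# 7(a): (P → R) ∧ (Q → R) versus (P ∨ Q) → R.
part_a = equivalent(lambda p, q, r: implies(p, r) and implies(q, r),
                    lambda p, q, r: implies(p or q, r), 3)

# 7(b): (P → R) ∨ (Q → R) versus (P ∧ Q) → R.
part_b = equivalent(lambda p, q, r: implies(p, r) or implies(q, r),
                    lambda p, q, r: implies(p and q, r), 3)

# The right-hand sides of (a) and (b) are not equivalent to each other.
mixed = equivalent(lambda p, q, r: implies(p or q, r),
                   lambda p, q, r: implies(p and q, r), 3)

print(part_a, part_b, mixed)
```

The last comparison is included as a sanity check that the helper can also report a failure: the row P true, Q false, R false separates the two formulas.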
9. ¬(P → ¬Q).


Chapter 2 


Section 2.1 


1. (a) ∀x[∃yF(x, y) → S(x)], where F(x, y) stands for “x has forgiven
y,” and S(x) stands for “x is a saint.”

(b) ¬∃x[C(x) ∧ ∀y(D(y) → S(x, y))], where C(x) stands for “x is in the
calculus class,” D(y) stands for “y is in the discrete math class,” and
S(x, y) stands for “x is smarter than y.”

(c) ∀x(¬(x = m) → L(x, m)), where L(x, y) stands for “x likes y,” and m
stands for Mary.

(d) ∃x(P(x) ∧ S(j, x)) ∧ ∃y(P(y) ∧ S(r, y)), where P(x) stands for “x is a
police officer,” S(x, y) stands for “x saw y,” j stands for Jane, and r
stands for Roger.

(e) ∃x(P(x) ∧ S(j, x) ∧ S(r, x)), where the letters have the same meanings
as in part (d).


4. (a) All unmarried men are unhappy.

(b) y is a sister of one of x’s parents; i.e., y is x’s blood aunt.

8. (a), (d), and (e) are true; (b), (c), and (f) are false.

Section 2.2 

1. (a) ∃x[M(x) ∧ ∀y(F(x, y) → ¬H(y))], where M(x) stands for “x is
majoring in math,” F(x, y) stands for “x and y are friends,” and
H(y) stands for “y needs help with his or her homework.” In
English: There is a math major all of whose friends don’t need
help with their homework.

(b) ∃x∀y(R(x, y) → ∃zL(y, z)), where R(x, y) stands for “x and y are
roommates” and L(y, z) stands for “y likes z.” In English: There is
someone all of whose roommates like at least one person.

(c) ∃x[(x ∈ A ∨ x ∈ B) ∧ (x ∉ C ∨ x ∈ D)].

(d) ∀x∃y[y > x ∧ ∀z(z² + 5z ≠ y)].


4. Hint: Begin by replacing P(x) with ¬P(x) in the first quantifier
negation law, to get the fact that ¬∃x¬P(x) is equivalent to ∀x¬¬P(x).

6. Hint: Begin by showing that ∃x(P(x) ∨ Q(x)) is equivalent to
¬∀x¬(P(x) ∨ Q(x)).

8. (∀x ∈ A P(x)) ∧ (∀x ∈ B P(x))
is equivalent to ∀x(x ∈ A → P(x)) ∧ ∀x(x ∈ B → P(x))
which is equivalent to ∀x[(x ∈ A → P(x)) ∧ (x ∈ B → P(x))]
which is equivalent to ∀x[(x ∉ A ∨ P(x)) ∧ (x ∉ B ∨ P(x))]
which is equivalent to ∀x[(x ∉ A ∧ x ∉ B) ∨ P(x)]
which is equivalent to ∀x[¬(x ∈ A ∨ x ∈ B) ∨ P(x)]
which is equivalent to ∀x[x ∉ (A ∪ B) ∨ P(x)]
which is equivalent to ∀x[x ∈ (A ∪ B) → P(x)]
which is equivalent to ∀x ∈ (A ∪ B) P(x).


11. A \ B = ∅ is equivalent to ¬∃x(x ∈ A ∧ x ∉ B)
which is equivalent to ∀x¬(x ∈ A ∧ x ∉ B)
which is equivalent to ∀x(x ∉ A ∨ x ∈ B)
which is equivalent to ∀x(x ∈ A → x ∈ B)
which is equivalent to A ⊆ B.


14. A ∩ B = ∅ is equivalent to ¬∃x(x ∈ A ∧ x ∈ B)
which is equivalent to ∀x¬(x ∈ A ∧ x ∈ B)
which is equivalent to ∀x(x ∉ A ∨ x ∉ B)
which is equivalent to ∀x(x ∈ A → x ∉ B)
which is equivalent to ∀x((x ∉ B ∧ x ∈ A) ↔ x ∈ A)
(by Section 1.5 exercise 11(b))
which is equivalent to ∀x(x ∈ A \ B ↔ x ∈ A)
which is equivalent to A \ B = A.


Section 2.3 
1. (a) ∀x(x ∈ F → ∀y(y ∈ x → y ∈ A)).

(b) ∀x(x ∈ A → ∃n ∈ N(x = 2n + 1)).
(c) ∀n ∈ N∃m ∈ N(n² + n + 1 = 2m + 1).

(d) ∃x(∀y(y ∈ x → ∃i ∈ I(y ∈ A_i)) ∧ ∀i ∈ I∃y(y ∈ x ∧ y ∉ A_i)).

4. ∩F = {red, blue} and ∪F = {red, green, blue, orange, purple}.

8. (a) A_2 = {2, 4}, A_3 = {3, 6}, B_2 = {2, 3}, B_3 = {3, 4}.

(b) ∩_{i∈I}(A_i ∪ B_i) = {3, 4} and (∩_{i∈I} A_i) ∪ (∩_{i∈I} B_i) = {3}.

(c) They are not equivalent.

12. One example is A = {1, 2} and B = {2, 3}.

14. (a) B_3 = {1, 2, 3, 4, 5} and B_4 = {1, 2, 4, 5, 6}.

(b) ∩_{j∈J} B_j = {1, 2, 4, 5}.

(c) ∪_{i∈I}(∩_{j∈J} A_{i,j}) = {1, 2, 4}. This is not equal to the set in part (b).

(d) x ∈ ∩_{j∈J}(∪_{i∈I} A_{i,j}) means ∀j ∈ J∃i ∈ I(x ∈ A_{i,j}) and
x ∈ ∪_{i∈I}(∩_{j∈J} A_{i,j}) means ∃i ∈ I∀j ∈ J(x ∈ A_{i,j}). They are not
equivalent.
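
The sets in exercise 14 can be computed directly. The Python sketch below assumes that I = {1, 2}, J = {3, 4}, and A_{i,j} = {i, j, i + j}; these definitions are not restated in the solution above, so treat them as an assumption (they do reproduce the sets listed in parts (a) through (c)).

```python
# Assumed data for exercise 14: I = {1, 2}, J = {3, 4}, A_{i,j} = {i, j, i + j}.
I = [1, 2]
J = [3, 4]
A = {(i, j): {i, j, i + j} for i in I for j in J}

# (a) B_j is the union over i in I of A_{i,j}.
B = {j: set().union(*(A[i, j] for i in I)) for j in J}

# (b) The intersection over j in J of the sets B_j.
inter_of_unions = set.intersection(*(B[j] for j in J))

# (c) The union over i in I of the intersection over j in J of A_{i,j}.
union_of_inters = set().union(
    *(set.intersection(*(A[i, j] for j in J)) for i in I)
)

print(B[3], B[4], inter_of_unions, union_of_inters)
```

Under these assumptions the element 5 belongs to the set in part (b) but not to the set in part (c), which illustrates why ∀j ∈ J∃i ∈ I and ∃i ∈ I∀j ∈ J cannot be interchanged.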


Chapter 3 


Section 3.1 


1. (a) Hypotheses: n is an integer larger than 1 and n is not prime.
Conclusion: 2ⁿ − 1 is not prime. The hypotheses are true when n
= 6, so the theorem tells us that 2⁶ − 1 is not prime. This is
correct, since 2⁶ − 1 = 63 = 9 · 7.

(b) We can conclude that 32767 is not prime. This is correct, since 32767
= 151 · 217.

(c) The theorem tells us nothing; 11 is prime, so the hypotheses are not
satisfied.

4. Suppose 0 < a < b. Then b − a > 0. Multiplying both sides by the
positive number b + a, we get (b + a) · (b − a) > (b + a) · 0, or in other
words b² − a² > 0. Since b² − a² > 0, it follows that a² < b². Therefore
if 0 < a < b then a² < b².

We will prove the contrapositive. Suppose x ∉ B. Then since x ∈ A, it
follows that x ∈ A \ B. But we also know that A \ B ⊆ C ∩ D, so we
can conclude that x ∈ C ∩ D, and therefore x ∈ D. Thus, if x ∉ D
then x ∈ B.

10. Hint: Add b to both sides of the inequality a < b.

12. We will prove the contrapositive. Suppose c ≤ d. Multiplying both
sides of this inequality by the positive number a, we get ac ≤ ad.
Also, multiplying both sides of the given inequality a < b by the
positive number d gives us ad < bd. Combining ac ≤ ad and ad < bd,
we can conclude that ac < bd. Thus, if ac ≥ bd then c > d.

15. Since x > 3 > 0, by the theorem in Example 3.2.1, x² > 9. Also,
multiplying both sides of the given inequality y < 2 by −2 (and
reversing the direction of the inequality, since −2 is negative) we get
−2y > −4. Finally, adding the inequalities x² > 9 and −2y > −4 gives us
x² − 2y > 5.


Section 3.2 


1. (a) Suppose P. Since P → Q, it follows that Q. But then, since Q →
R, we can conclude R. Thus, P → R.

(b) Suppose P. To prove that Q → R, we will prove the contrapositive, so
suppose ¬R. Since ¬R → (P → ¬Q), it follows that P → ¬Q, and
since we know P, we can conclude ¬Q. Thus, Q → R, so P → (Q →
R).

Suppose x ∈ A \ B and x ∈ B \ C. Since x ∈ A \ B, x ∈ A and x ∉ B,
and since x ∈ B \ C, x ∈ B and x ∉ C. But now we have x ∈ B and x
∉ B, which is a contradiction. Therefore it cannot be the case that x ∈
A \ B and x ∈ B \ C.

Suppose a ∈ A \ B. This means that a ∈ A and a ∉ B. Since a ∈ A
and a ∈ C, a ∈ A ∩ C. But then since A ∩ C ⊆ B, it follows that a ∈
B, and this contradicts the fact that a ∉ B. Thus, a ∉ A \ B.

9. Hint: Assume a < 1/a < b < 1/b. Now prove that a < 1, then use this
fact to prove that a < 0, and then use this fact to prove that a < −1.

12. (a) The sentence “Then x = 3 and y = 8” is incorrect. (Why?)

(b) One counterexample is x = 3, y = 7.


15. P  Q  R  P → (Q → R)  ¬R → (P → ¬Q)
    F  F  F       T             T
    F  F  T       T             T
    F  T  F       T             T
    F  T  T       T             T
    T  F  F       T             T
    T  F  T       T             T
    T  T  F       F             F
    T  T  T       T             T

Section 3.3 
1. Suppose ∃x(P(x) → Q(x)). Then we can choose some x_0 such that
P(x_0) → Q(x_0). Now suppose that ∀xP(x). Then in particular, P(x_0),
and since P(x_0) → Q(x_0), it follows that Q(x_0). Since we have found a
particular value of x for which Q(x) holds, we can conclude that
∃xQ(x). Thus ∀xP(x) → ∃xQ(x).


ie 


14. 


ive 


20. 


Suppose that A ⊆ B \ C, but A and C are not disjoint. Then we can
choose some x such that x ∈ A and x ∈ C. Since x ∈ A and A ⊆ B \
C, it follows that x ∈ B \ C, which means that x ∈ B and x ∉ C. But
now we have both x ∈ C and x ∉ C, which is a contradiction. Thus, if
A ⊆ B \ C then A and C are disjoint.


Suppose x > 2. Let y = (x + √(x² − 4))/2, which is defined since x² −
4 > 0. Then

y + 1/y = (x + √(x² − 4))/2 + 2/(x + √(x² − 4))
        = (2x² + 2x√(x² − 4)) / (2(x + √(x² − 4))) = x.


Suppose F is a family of sets and A ∈ F. Suppose x ∈ ∩F. Then by
the definition of ∩F, since x ∈ ∩F and A ∈ F, x ∈ A. But x was an
arbitrary element of ∩F, so it follows that ∩F ⊆ A.

Hint: Assume F ⊆ G and let x be an arbitrary element of ∪F. You
must prove that x ∈ ∪G, which means ∃A ∈ G(x ∈ A), so you
should try to find some A ∈ G such that x ∈ A. To do this, write out
the givens in logical notation. You will find that one of them is a
universal statement, and one is existential. Apply existential
instantiation to the existential one.


Suppose x ∈ ∪_{i∈I} 𝒫(A_i). Then we can choose some i ∈ I such that x
∈ 𝒫(A_i), or in other words x ⊆ A_i. Now let a be an arbitrary element
of x. Then a ∈ A_i and therefore a ∈ ∪_{i∈I} A_i. Since a was an
arbitrary element of x, it follows that x ⊆ ∪_{i∈I} A_i, which means that
x ∈ 𝒫(∪_{i∈I} A_i). Thus ∪_{i∈I} 𝒫(A_i) ⊆ 𝒫(∪_{i∈I} A_i).

Hint: The last hypothesis means ∀A ∈ F ∀B ∈ G(A ⊆ B), so if in the
course of the proof you ever come across sets A ∈ F and B ∈ G, you
can conclude that A ⊆ B. Start the proof by letting x be arbitrary and
assuming x ∈ ∪F, and prove that x ∈ ∩G. To see where to go from
there, write these statements in logical symbols.

The sentence “Then for every real number x, x² < 0” is incorrect.
(Why?)


22. Based on the logical form of the statement to be proven, the proof 
should have this outline: 


Let x = . . .
Let y be an arbitrary real number.
[Proof of xy² = y − x goes here.]
Since y was arbitrary, ∀y ∈ R(xy² = y − x).
Thus, ∃x ∈ R∀y ∈ R(xy² = y − x).


This outline makes it clear that y should be introduced into the 
proof after x. Therefore, x cannot be defined in terms of y, because y 
will not yet have been introduced into the proof when x is being 
defined. But in the given proof, x is defined in terms of y in the first 
sentence. (The mistake has been disguised by the fact that the 
sentence “Let y be an arbitrary real number” has been left out of the 
proof. If you try to add this sentence to the proof, you will find that 
there is nowhere it could be added that would lead to a correct proof 
of the incorrect theorem. ) 

25. Here is the beginning of the proof: Let x be an arbitrary real number. 
Let y = 2x. Now let z be an arbitrary real number. Then... . 


Section 3.4 


1. (→) Suppose ∀x(P(x) ∧ Q(x)). Let y be arbitrary. Then since ∀x(P(x)
∧ Q(x)), P(y) ∧ Q(y), and so in particular P(y). Since y was arbitrary,
this shows that ∀xP(x). A similar argument proves ∀xQ(x): for
arbitrary y, P(y) ∧ Q(y), and therefore Q(y). Thus, ∀xP(x) ∧ ∀xQ(x).

(←) Suppose ∀xP(x) ∧ ∀xQ(x). Let y be arbitrary. Then since
∀xP(x), P(y), and similarly since ∀xQ(x), Q(y). Thus, P(y) ∧ Q(y),
and since y was arbitrary, it follows that ∀x(P(x) ∧ Q(x)).

4. Suppose that A ⊆ B and A ⊈ C. Since A ⊈ C, we can choose some a
∈ A such that a ∉ C. Since a ∈ A and A ⊆ B, a ∈ B. Since a ∈ B
and a ∉ C, B ⊈ C.

7. Let A and B be arbitrary sets. Let x be arbitrary, and suppose that x ∈
𝒫(A ∩ B). Then x ⊆ A ∩ B. Now let y be an arbitrary element of x.
Then since x ⊆ A ∩ B, y ∈ A ∩ B, and therefore y ∈ A. Since y was
arbitrary, this shows that x ⊆ A, so x ∈ 𝒫(A). A similar argument
shows that x ⊆ B, and therefore x ∈ 𝒫(B). Thus, x ∈ 𝒫(A) ∩ 𝒫(B).

Now suppose that x ∈ 𝒫(A) ∩ 𝒫(B). Then x ∈ 𝒫(A) and x ∈
𝒫(B), so x ⊆ A and x ⊆ B. Suppose that y ∈ x. Then since x ⊆ A and
x ⊆ B, y ∈ A and y ∈ B, so y ∈ A ∩ B. Thus, x ⊆ A ∩ B, so x ∈
𝒫(A ∩ B).

13. Suppose that x and y are odd. Then we can choose integers j and k
such that x = 2j + 1 and y = 2k + 1. Therefore xy = (2j + 1)(2k + 1) =
4jk + 2j + 2k + 1 = 2(2jk + j + k) + 1. Since 2jk + j + k is an integer, it
follows that xy is odd.

16. Hint: Let x ∈ R be arbitrary, and prove both directions of the
biconditional separately. For the “→” direction, use existential
instantiation and proof by contradiction. For the “←” direction,
assume that x ≠ 1 and then solve the equation x + y = xy for y in order
to decide what value to choose for y.

18. Suppose that ∪F and ∩G are not disjoint. Then we can choose some
x such that x ∈ ∪F and x ∈ ∩G. Since x ∈ ∪F, we can choose
some A ∈ F such that x ∈ A. Since we are given that every element
of F is disjoint from some element of G, there must be some B ∈ G
such that A ∩ B = ∅. Since x ∈ A, it follows that x ∉ B. But we also
have x ∈ ∩G and B ∈ G, from which it follows that x ∈ B, which is a
contradiction. Thus, ∪F and ∩G must be disjoint.

(a) Suppose x ∈ ∪(F ∩ G). Then we can choose some A ∈ F ∩ G
such that x ∈ A. Since A ∈ F ∩ G, A ∈ F and A ∈ G. Since x ∈
A and A ∈ F, x ∈ ∪F, and similarly since x ∈ A and
A ∈ G, x ∈ ∪G. Therefore, x ∈ (∪F) ∩ (∪G). Since x was
arbitrary, this shows that ∪(F ∩ G) ⊆ (∪F) ∩ (∪G).

(b) The sentence “Thus, we can choose a set A such that A ∈ F, A ∈ G,
and x ∈ A” is incorrect. (Why?)

(c) One example is F = {{1}, {2}}, G = {{1}, {1, 2}}.


22. Suppose that ∪F ⊈ ∪G. Then there is some x ∈ ∪F such that x ∉
∪G. Since x ∈ ∪F, we can choose some A ∈ F such that x ∈ A.
Now let B ∈ G be arbitrary. If A ⊆ B, then since x ∈ A, x ∈ B. But
then since x ∈ B and B ∈ G, x ∈ ∪G, which we already know is false.
Therefore A ⊈ B. Since B was arbitrary, this shows that for all B ∈ G,
A ⊈ B. Thus, we have shown that there is some A ∈ F such that for
all B ∈ G, A ⊈ B.


24. (a) Suppose x ∈ ∪_{i∈I}(A_i \ B_i). Then we can choose some i ∈ I
such that x ∈ A_i \ B_i, which means x ∈ A_i and x ∉ B_i. Since x ∈
A_i, x ∈ ∪_{i∈I} A_i, and since x ∉ B_i, x ∉ ∩_{i∈I} B_i. Thus,
x ∈ (∪_{i∈I} A_i) \ (∩_{i∈I} B_i).

(b) One example is I = {1, 2}, A_1 = B_1 = {1}, A_2 = B_2 = {2}.


Section 3.5 


1. Suppose x ∈ A ∩ (B ∪ C). Then x ∈ A, and either x ∈ B or x ∈ C.
Case 1. x ∈ B. Then since x ∈ A, x ∈ A ∩ B, so x ∈ (A ∩ B) ∪ C.
Case 2. x ∈ C. Then clearly x ∈ (A ∩ B) ∪ C.
Since x was arbitrary, we can conclude that A ∩ (B ∪ C) ⊆
(A ∩ B) ∪ C.

5. Suppose x ∈ A. We now consider two cases:

Case 1. x ∈ C. Then x ∈ A ∩ C, so since A ∩ C ⊆ B ∩ C, x ∈ B ∩
C, and therefore x ∈ B.

Case 2. x ∉ C. Since x ∈ A, x ∈ A ∪ C, so since A ∪ C ⊆ B ∪ C, x
∈ B ∪ C. But x ∉ C, so we must have x ∈ B.

Thus, x ∈ B, and since x was arbitrary, A ⊆ B.

8. Hint: Assume x ∈ 𝒫(A) ∪ 𝒫(B), which means that either x ∈ 𝒫(A)
or x ∈ 𝒫(B). Treat these as two separate cases. In case 1, assume x ∈
𝒫(A), which means x ⊆ A, and prove x ∈ 𝒫(A ∪ B), which means x
⊆ A ∪ B. Case 2 is similar.

12. Let x be an arbitrary real number.

(←) Suppose |x − 4| > 2.

Case 1. x − 4 ≥ 0. Then |x − 4| = x − 4, so we have x − 4 > 2, and
therefore x > 6. Adding x to both sides gives us 2x > 6 + x, so 2x − 6 >
x. Since x > 6, this implies that 2x − 6 is positive, so |2x − 6| = 2x − 6 >
x.

Case 2. x − 4 < 0. Then |x − 4| = 4 − x, so we have 4 − x > 2, and
therefore x < 2. Therefore 3x < 6, and subtracting 2x from both sides
we get x < 6 − 2x. Also, from x < 2 we get 2x < 4, so 2x − 6 < −2.
Therefore 2x − 6 is negative, so |2x − 6| = 6 − 2x > x.

(→) Hint: Imitate the “←” direction, using the cases 2x − 6 ≥ 0 and
2x − 6 < 0.

16. (a) Suppose x ∈ ∪(F ∪ G). Then we can choose some A ∈ F ∪ G
such that x ∈ A. Since A ∈ F ∪ G, either A ∈ F or A ∈ G.

Case 1. A ∈ F. Since x ∈ A and A ∈ F, x ∈ ∪F, so x ∈
(∪F) ∪ (∪G).

Case 2. A ∈ G. Since x ∈ A and A ∈ G, x ∈ ∪G, so x ∈
(∪F) ∪ (∪G).

Thus, x ∈ (∪F) ∪ (∪G).

Now suppose that x ∈ (∪F) ∪ (∪G). Then either x ∈ ∪F or
x ∈ ∪G.

Case 1. x ∈ ∪F. Then we can choose some A ∈ F such that x ∈
A. Since A ∈ F, A ∈ F ∪ G, so since x ∈ A, it follows that
x ∈ ∪(F ∪ G).

Case 2. x ∈ ∪G. A similar argument shows that x ∈ ∪(F ∪ G).

Thus, x ∈ ∪(F ∪ G).

(b) The theorem is: ∩(F ∪ G) = (∩F) ∩ (∩G).

20. (→) Suppose that A △ B and C are disjoint. Let x be an arbitrary
element of A ∩ C. Then x ∈ A and x ∈ C. If x ∉ B, then since x ∈ A,
x ∈ A \ B, and therefore x ∈ A △ B. But also x ∈ C, so this
contradicts our assumption that A △ B and C are disjoint. Therefore x
∈ B. Since we also know x ∈ C, we have x ∈ B ∩ C. Since x was an
arbitrary element of A ∩ C, this shows that A ∩ C ⊆ B ∩ C. A similar
argument shows that B ∩ C ⊆ A ∩ C.

(←) Suppose that A ∩ C = B ∩ C. Suppose that A △ B and C are
not disjoint. Then we can choose some x such that x ∈ A △ B and x ∈
C. Since x ∈ A △ B, either x ∈ A \ B or x ∈ B \ A.

Case 1. x ∈ A \ B. Then x ∈ A and x ∉ B. Since we also know x ∈
C, we can conclude that x ∈ A ∩ C but x ∉ B ∩ C. This contradicts
the fact that A ∩ C = B ∩ C.

Case 2. x ∈ B \ A. Similarly, this leads to a contradiction.

Thus we can conclude that A △ B and C are disjoint.

(a) Hint: Suppose x ∈ A \ C, and then break the proof into cases,
depending on whether or not x ∈ B. (b) Hint: Apply part (a).

24. (a) Suppose x ∈ (A ∪ B) △ C. Then either x ∈ (A ∪ B) \ C or x ∈
C \ (A ∪ B).

Case 1. x ∈ (A ∪ B) \ C. Then either x ∈ A or x ∈ B, and x ∉ C.
We now break case 1 into two subcases, depending on whether x ∈
A or x ∈ B:

Case 1a. x ∈ A. Then x ∈ A \ C, so x ∈ A △ C, so x ∈ (A △ C) ∪
(B △ C).

Case 1b. x ∈ B. Similarly, x ∈ B △ C, so x ∈ (A △ C) ∪ (B △ C).

Case 2. x ∈ C \ (A ∪ B). Then x ∈ C, x ∉ A, and x ∉ B. It follows
that x ∈ A △ C and x ∈ B △ C, so certainly x ∈ (A △ C) ∪ (B △ C).

(b) Here is one example: A = {1}, B = {2}, C = {1, 2}.

27. The proof is incorrect, because it only establishes that either 0 < x or x
< 6, but what must be proven is that 0 < x and x < 6. However, it can
be fixed.

29. The proof is correct.

31. Hint: Here is a counterexample to the theorem: A = {1, 2}, B = {1}, C
= {2}.


Section 3.6 


1. Let x be an arbitrary real number. Let y = x/(x² + 1). Then

x − y = x − x/(x² + 1) = (x³ + x − x)/(x² + 1) = x³/(x² + 1)
      = x² · x/(x² + 1) = x²y.

To see that y is unique, suppose that x²z = x − z. Then z(x² + 1) = x,
and since x² + 1 ≠ 0, we can divide both sides by x² + 1 to conclude
that z = x/(x² + 1) = y.

Suppose x ≠ 0. Let y = 1/x. Now let z be an arbitrary real number. Then
zy = z(1/x) = z/x, as required.

To see that y is unique, suppose that y′ is a number with the
property that ∀z ∈ R(zy′ = z/x). Then in particular, taking z = 1, we
have y′ = 1/x, so y′ = y.

6. (a) Let A = ∅ ∈ 𝒫(U). Then clearly for any B ∈ 𝒫(U), A ∪ B = ∅
∪ B = B.

To see that A is unique, suppose that A′ ∈ 𝒫(U) and for all B ∈
𝒫(U), A′ ∪ B = B. Then in particular, taking B = ∅, we can
conclude that A′ ∪ ∅ = ∅. But clearly A′ ∪ ∅ = A′, so we have A′ =
∅ = A.

(b) Hint: Let A = U.

11. Existence: We are given that for every G ⊆ F, ∪G ∈ F, so in
particular, since F ⊆ F, ∪F ∈ F. Let A = ∪F. Now suppose B ∈
F. Then by exercise 8 of Section 3.3, B ⊆ ∪F = A, as required.

Uniqueness: Suppose that A_1 ∈ F, A_2 ∈ F, ∀B ∈ F(B ⊆ A_1), and
∀B ∈ F(B ⊆ A_2). Applying this last fact with B = A_1 we can
conclude that A_1 ⊆ A_2, and similarly the previous fact implies that A_2
⊆ A_1. Thus A_1 = A_2.
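
The algebra in the solution to exercise 1 of this section can be spot-checked with exact rational arithmetic. The Python sketch below is only a sanity check on sample values, not a proof; the names witness and satisfies are ours.

```python
from fractions import Fraction


def witness(x):
    """The value y = x/(x^2 + 1) proposed in the solution."""
    return x / (x * x + 1)


def satisfies(x, y):
    """Does y satisfy the equation x^2 * y = x - y?"""
    return x * x * y == x - y


# Sample rational values of x; Fraction avoids floating-point round-off.
samples = [Fraction(n, d) for n in range(-6, 7) for d in (1, 2, 3)]

# Existence: the proposed y works for every sampled x.
existence_ok = all(satisfies(x, witness(x)) for x in samples)

# Uniqueness (on the sample grid only): any sampled z that satisfies the
# equation for a given x must equal the proposed y.
uniqueness_ok = all(z == witness(x)
                    for x in samples for z in samples if satisfies(x, z))

print(existence_ok, uniqueness_ok)
```

Exact fractions are used so that an equality test is meaningful; with floating-point numbers the two sides of the equation could differ by a rounding error even when the algebra is right.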


Section 3.7 


1. Hint: Comparing (b) to exercise 16 of Section 3.3 may give you an
idea of what to use for A.

Suppose 𝒫(∪_{i∈I} A_i) ⊆ ∪_{i∈I} 𝒫(A_i). It is clear that ∪_{i∈I} A_i ⊆ ∪_{i∈I} A_i,
so ∪_{i∈I} A_i ∈ 𝒫(∪_{i∈I} A_i) and therefore ∪_{i∈I} A_i ∈ ∪_{i∈I} 𝒫(A_i). By the
definition of the union of a family, this means that there is some i ∈ I
such that ∪_{i∈I} A_i ⊆ A_i. Now let j ∈ I be arbitrary. Then by exercise 8
in Section 3.3, A_j ⊆ ∪_{i∈I} A_i, so A_j ⊆ A_i.


8. Suppose that lim_{x→c} f(x) = L > 0. Let ε = L. Then by the definition of
limit, we can choose some δ > 0 such that for all x, if 0 < |x − c| < δ
then |f(x) − L| < ε = L. Now let x be an arbitrary real number and
suppose 0 < |x − c| < δ. Then |f(x) − L| < L, so −L < f(x) − L < L and
therefore 0 < f(x) < 2L. Therefore, for every real number x, if 0 < |x −
c| < δ then f(x) > 0.

10. The proof is correct.


Chapter 4 


Section 4.1 


1. (a) {(x, y) ∈ P × P | x is a parent of y} = {(Bill Clinton, Chelsea
Clinton), (Goldie Hawn, Kate Hudson), . . .}.

(b) {(x, y) ∈ C × U | there is someone who lives in x and attends y}. If
you are a university student, then let x be the city you live in, and let
y be the university you attend; (x, y) will then be an element of this
truth set.

4. A × (B ∩ C) = (A × B) ∩ (A × C) = {(1, 4), (2, 4), (3, 4)},
A × (B ∪ C) = (A × B) ∪ (A × C) = {(1, 1), (2, 1), (3, 1), (1, 3), (2, 3),
(3, 3), (1, 4), (2, 4), (3, 4)},
(A × B) ∩ (C × D) = (A ∩ C) × (B ∩ D) = ∅,
(A × B) ∪ (C × D) = {(1, 1), (2, 1), (3, 1), (1, 4), (2, 4), (3, 4), (3, 5), (4, 5)},
(A ∪ C) × (B ∪ D) = {(1, 1), (2, 1), (3, 1), (4, 1), (1, 4), (2, 4), (3, 4), (4, 4),
(1, 5), (2, 5), (3, 5), (4, 5)}.
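
The products in exercise 4 can be recomputed with a few lines of Python. The sets A = {1, 2, 3}, B = {1, 4}, C = {3, 4}, and D = {5} used below are not restated in the solution; they are inferred from the answers listed above, so treat them as an assumption. The helper name cart is ours.

```python
from itertools import product


def cart(X, Y):
    """The Cartesian product X × Y as a set of ordered pairs."""
    return set(product(X, Y))


# Assumed sets, inferred from the listed answers.
A, B, C, D = {1, 2, 3}, {1, 4}, {3, 4}, {5}

first = cart(A, B & C)                      # A × (B ∩ C)
second = cart(A, B | C)                     # A × (B ∪ C)
third = cart(A, B) & cart(C, D)             # (A × B) ∩ (C × D)
fourth = cart(A, B) | cart(C, D)            # (A × B) ∪ (C × D)
fifth = cart(A | C, B | D)                  # (A ∪ C) × (B ∪ D)

print(sorted(first), len(second), third, len(fourth), len(fifth))
```

With these sets, (A × B) ∪ (C × D) has 8 elements while (A ∪ C) × (B ∪ D) has 12, so the union is the one operation in this exercise that does not distribute in the way the intersections do.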


The cases are not exhaustive. 


Yes, it is true. 


10. Suppose (x, y) ∈ (A \ C) × (B \ D). Then x ∈ A \ C and y ∈ B \ D,
which means x ∈ A, x ∉ C, y ∈ B, and y ∉ D. Since x ∈ A and y ∈
B, (x, y) ∈ A × B. And since x ∉ C, (x, y) ∉ C × D. Therefore (x, y)
∈ (A × B) \ (C × D).


15. The theorem is incorrect. Counterexample: A = {1}, B = C = D = ∅.
Notice that A ⊈ C. Where is the mistake in the proof that A ⊆ C?


Section 4.2 


1. (a) Domain = {p ∈ P | p has a living child}; Range = {p ∈ P | p has
a living parent}.
(b) Domain = R; Range = R⁺.


5. (a) {(1, 4), (1, 5), (1, 6), (2, 4), (3, 6)}.


(b) {(4, 4), (5, 5), (5, 6), (6, 5), (6, 6)}.
8. E ∘ E ⊆ F.
11. We prove the contrapositives of both directions.


(→) Suppose Ran(R) and Dom(S) are not disjoint. Then we can
choose some b ∈ Ran(R) ∩ Dom(S). Since b ∈ Ran(R), we can
choose some a ∈ A such that (a, b) ∈ R. Similarly, since b ∈
Dom(S), we can choose some c ∈ C such that (b, c) ∈ S. But then (a,
c) ∈ S ∘ R, so S ∘ R ≠ ∅.

(←) Suppose S ∘ R ≠ ∅. Then we can choose some (a, c) ∈ S ∘ R.
By the definition of S ∘ R, this means that we can choose some b ∈ B
such that (a, b) ∈ R and (b, c) ∈ S. But then b ∈ Ran(R) and b ∈
Dom(S), so Ran(R) and Dom(S) are not disjoint.


Section 4.3 
1. [Figure: a directed graph on the vertices 1, 2, 3, 4.]

S ∘ R = {(a, y), (a, z), (b, x), (c, y), (c, z)}.

(→) Suppose R is reflexive. Let (x, y) be an arbitrary element of i_A.
Then by the definition of i_A, x = y ∈ A. Since R is reflexive, (x, y) =
(x, x) ∈ R. Since (x, y) was arbitrary, this shows that i_A ⊆ R.

(←) Suppose i_A ⊆ R. Let x ∈ A be arbitrary. Then (x, x) ∈ i_A, so
since i_A ⊆ R, (x, x) ∈ R. Since x was arbitrary, this shows that R is
reflexive.

10. Suppose (x, y) ∈ i_D. Then x = y ∈ D = Dom(S), so there is some z ∈
A such that (x, z) ∈ S. Therefore (z, x) ∈ S⁻¹, so (x, y) = (x, x) ∈ S⁻¹ ∘
S. Thus, i_D ⊆ S⁻¹ ∘ S. The proof of the other statement is similar.


13. (a) Yes. To prove it, suppose R_1 and R_2 are reflexive, and suppose a
∈ A. Since R_1 is reflexive, (a, a) ∈ R_1, so (a, a) ∈ R_1 ∪ R_2.

(b) Yes. To prove it, suppose R_1 and R_2 are symmetric, and suppose (x, y)
∈ R_1 ∪ R_2. Then either (x, y) ∈ R_1 or (x, y) ∈ R_2. If (x, y) ∈ R_1 then
since R_1 is symmetric, (y, x) ∈ R_1, so (y, x) ∈ R_1 ∪ R_2. Similar
reasoning shows that if (x, y) ∈ R_2 then (y, x) ∈ R_1 ∪ R_2.

(c) No. Counterexample: A = {1, 2, 3}, R_1 = {(1, 2)}, R_2 = {(2, 3)}.


17. First note that by part 2 of Theorem 4.3.4, since R and S are
symmetric, R = R⁻¹ and S = S⁻¹. Therefore

R ∘ S is symmetric iff R ∘ S = (R ∘ S)⁻¹ (Theorem 4.3.4, part 2)
iff R ∘ S = S⁻¹ ∘ R⁻¹ (Theorem 4.2.5, part 5)
iff R ∘ S = S ∘ R.


20. Suppose R is transitive, and suppose (X, Y) ∈ S and (Y, Z) ∈ S. To
prove that (X, Z) ∈ S we must show that ∀x ∈ X∀z ∈ Z(xRz), so let x
∈ X and z ∈ Z be arbitrary. Since Y ∈ B, Y ≠ ∅, so we can choose y
∈ Y. Since (X, Y) ∈ S and (Y, Z) ∈ S, by the definition of S we have
xRy and yRz. But then since R is transitive, xRz, as required. The
empty set had to be excluded from B so that we could come up with y
∈ Y in this proof. (Can you find a counterexample if the empty set is
not excluded?)

23. Hint: Suppose aRb and bRc. To prove aRc, suppose that X ⊆ A \ {a,
c} and X ∪ {a} ∈ F; you must prove that X ∪ {c} ∈ F. To do this,
you may find it helpful to consider two cases: b ∉ X or b ∈ X. In the
second of these cases, try working with the sets X′ = (X ∪ {a}) \ {b}
and X″ = (X ∪ {c}) \ {b}.


Section 4.4 


1. (a) Partial order, but not total order. (b) Not a partial order. (c) 
Partial order, but not total order. 


4. (>) Suppose that R is both antisymmetric and symmetric. Suppose 
that (x, y) E R. Then since R is symmetric, (y, x) E R, and since R is 


11. 


14. 


(b) 


17. 


antisymmetric, it follows that x = y. Therefore (x, y) E i4. Since (x, y) 
was arbitrary, this shows that R © iy. 


(-) Suppose that R © i4. Suppose (x, y) E R. Then (x, y) E i}, so x 


= y, and therefore (y, x) = (x, y) E R. This shows that R is symmetric. 
To see that R is antisymmetric, suppose that (x, y) E R and (y, x) E R 
Then (x, y) E iy, s0 xX = y. 


To see that T is reflexive, consider an arbitrary (a, b) E A x B. Since 
R and S are both reflexive, we have aRa and bSb. By the definition of 
T, it follows that (a, b)T (a, b). To see that T is antisymmetric, 
suppose that (a, b)T (a', b') and (a', b’)T (a, b). Then aRa' and a’ Ra, 
so since R is antisymmetric, a = a’. Similarly, bSb’ and b' Sb, so since 
S is antisymmetric, we also have b = b'. Thus (a, b) = (a’, b’), as 
required. Finally, to see that T is transitive, suppose that (a, b)T (a’, b') 
and (a', b')T (a", b"). Then aRa’ and a' Ra", so since R is transitive, 
aRa". Similarly, bSb' and b' Sb", so bSb", and therefore (a, b)T (a", 
b"). 

Even if both R and S are total orders, T need not be a total order. 
The minimal elements of B are the prime numbers. B has no smallest 
element. 

(a) bis the R-largest element of B 


iff b e B and Yx € B(x Rb) 


iff b € B and Yx € B(bR7'x) 
iff b is the R~'-smallest element of B. 


b is an R-maximal element of B 


iff b e B and ~3x € B(bRx Ab Æx) 

iff b € B and ~3x € B(xRT!b A x Æ b) 

iff b is an R~'-minimal element of B. 
No. Let A = R x R, and let R = {((x, y), (x,y) EA x A| x< x' andy 
< y'}. (You might want to compare this to exercise 8.) Let B = {(0, 0)} 
U ({1} x R). We will leave it to you to check that R is a partial order 


on A, and that (0, 0) is the only minimal element of B, but it is not a 
smallest element. 


21, 


(b) 


(c) 


24. 


(b) 


Ali 


(b) 


(a) Suppose that x E U and xRy. To prove that y E U, we must 
show that y is an upper bound for B, so suppose that b E B. 
Since x © U, x is an upper bound for B, so bRx. But we also 
have xRy, so by the transitivity of R we can conclude that bRy. 
Since b was arbitrary, this shows that y is an upper bound for B. 

Suppose b € B. To prove that b is a lower bound for U, let x be an 

arbitrary element of U. Then by the definition of U, x is an upper 

bound for B, so bRx. Since x was arbitrary, this shows that b is a 

lower bound for U. 

Hint: Suppose x is the greatest lower bound of U. First use part (b) to 

show that x is an upper bound for B, and therefore x E U. Then use 

the fact that x is a lower bound for U to show that x is the smallest 

element of U — in other words, it is the least upper bound of B. 


(a) Suppose (x, y) ∈ S. Then either (x, y) ∈ R or (x, y) ∈ R⁻¹. If (x,
y) ∈ R, then (y, x) ∈ R⁻¹, so (y, x) ∈ S. If (x, y) ∈ R⁻¹, then (y,
x) ∈ R, so (y, x) ∈ S. Therefore S is symmetric. Since S = R ∪
R⁻¹, it is clear that R ⊆ S.

Suppose T is a symmetric relation on A and R ⊆ T. To show that S ⊆
T, let (x, y) be an arbitrary element of S. Then either (x, y) ∈ R or (x,
y) ∈ R⁻¹. If (x, y) ∈ R, then since R ⊆ T, (x, y) ∈ T. If (x, y) ∈ R⁻¹,
then (y, x) ∈ R, so since R ⊆ T, (y, x) ∈ T. But T is symmetric, so it
follows that (x, y) ∈ T.
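
If you would like to experiment with this argument, here is a minimal Python sketch. It assumes a relation is represented as a set of ordered pairs; the function names and the sample relation R are illustrative choices, not part of the exercise.

```python
def symmetric_closure(r):
    """Return R ∪ R⁻¹ for a relation r given as a set of ordered pairs."""
    inverse = {(y, x) for (x, y) in r}
    return set(r) | inverse


def is_symmetric(rel):
    """True if (y, x) is in rel whenever (x, y) is."""
    return all((y, x) in rel for (x, y) in rel)


# Hypothetical sample relation on {1, 2, 3}.
R = {(1, 2), (2, 3), (3, 3)}
S = symmetric_closure(R)

# Any symmetric relation T that contains R must also contain S.
T = S | {(4, 4)}
```

Here S is symmetric and contains R, and the symmetric relation T, which was chosen to contain R, also contains S, as the second half of the proof predicts.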

(a) First, note that R₁ ⊆ R and R₂ ⊆ R. It follows, by exercise 26,
that S₁ ⊆ S and S₂ ⊆ S, so S₁ ∪ S₂ ⊆ S. For the other direction,
note that R = R₁ ∪ R₂ ⊆ S₁ ∪ S₂, and by exercise 13(b) of
Section 4.3, S₁ ∪ S₂ is symmetric. Therefore, by exercise 24(b),
S ⊆ S₁ ∪ S₂.

Imitating the first half of the proof in part (a), we can use exercise 26
to show that T₁ ∪ T₂ ⊆ T. However, the answer to exercise 13(c) of
Section 4.3 was no, so we can't imitate the second half of the proof.
In fact, the example given in the solution to exercise 13(c) works as
an example for which T₁ ∪ T₂ ≠ T.


Section 4.5 


1. 


10. 


LS, 
16. 


Here is a list of all partitions: 


{{1, 2, 3}}
{{1, 2}, {3}}
{{1, 3}, {2}}
{{2, 3}, {1}}
{{1}, {2}, {3}}
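
One way to double-check a list like this is to generate the partitions mechanically. The following Python sketch is only an illustration; the recursive strategy (put the first element into an existing block, or give it a block of its own) and the function name are assumptions, not something taken from the exercise.

```python
def partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for smaller in partitions(rest):
        # Put `first` into each existing block in turn ...
        for i, block in enumerate(smaller):
            yield smaller[:i] + [[first] + block] + smaller[i + 1:]
        # ... or give it a block of its own.
        yield [[first]] + smaller


all_partitions = list(partitions([1, 2, 3]))
```

For the set {1, 2, 3} this produces five partitions, matching the list above.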


(a) R is an equivalence relation. There are 26 equivalence classes — 
one for each letter of the alphabet. The equivalence classes are: 
the set of all words that start with a, the set of all words that start 
with b, ..., the set of all words that start with z. 

S is not an equivalence relation, because it is not transitive.

T is an equivalence relation. The equivalence classes are: the set of all
one-letter words, the set of all two-letter words, and so on. For every
positive integer n, if there is at least one English word of length n,
then the set of all words of length n is an equivalence class.


The assumption that is needed is that for every date d, someone was 
born on the date d. What would go wrong if, say, just by chance, no 
one was born on April 23? Where in the proof is this assumption 
used? 


Since S is the equivalence relation determined by F, the proof of 
Theorem 4.5.6 shows that A/S = F = A/R. The desired conclusion now 


follows from exercise 9. 

See Lemma 7.3.4. 

By exercise 16(a) of Section 3.5, ⋃(F ∪ G) = (⋃F) ∪ (⋃G) = A ∪ B. To
see that F ∪ G is pairwise disjoint, suppose that X ∈ F ∪ G, Y ∈ F ∪
G, and X ∩ Y ≠ ∅. If X ∈ F and Y ∈ G then X ⊆ A and Y ⊆ B, and
since A and B are disjoint it follows that X and Y are disjoint, which is
a contradiction. Thus it cannot be the case that X ∈ F and Y ∈ G, and
a similar argument can be used to rule out the possibility that X ∈ G
and Y ∈ F. Thus, X and Y are either both elements of F or both
elements of G. If they are both in F, then since F is pairwise disjoint,
X = Y. A similar argument applies if they are both in G. Finally, we
have ∀X ∈ F(X ≠ ∅) and ∀X ∈ G(X ≠ ∅), and it follows by exercise
8 of Section 2.2 that ∀X ∈ F ∪ G(X ≠ ∅).


20. (a) Here is the proof of transitivity: Suppose (x, y) ∈ T and (y, z) ∈
T. Then since T = R ∩ S, (x, y) ∈ R and (y, z) ∈ R, so since R is
transitive, (x, z) ∈ R. Similarly, (x, z) ∈ S, so (x, z) ∈ R ∩ S = T.

(b) Suppose x ∈ A. Then for all y ∈ A,

y ∈ [x]_T iff (y, x) ∈ T iff (y, x) ∈ R ∧ (y, x) ∈ S
iff y ∈ [x]_R ∧ y ∈ [x]_S iff y ∈ [x]_R ∩ [x]_S.
(c) Suppose X ∈ A/T. Then since A/T is a partition, X ≠ ∅. Also, for
some x ∈ A, X = [x]_T = [x]_R ∩ [x]_S, so since [x]_R ∈ A/R and [x]_S ∈
A/S, X ∈ (A/R) · (A/S).
Now suppose X ∈ (A/R) · (A/S). Then for some y and z in A, X =
[y]_R ∩ [z]_S. Also, X ≠ ∅, so we can choose some x ∈ X. Therefore x
∈ [y]_R and x ∈ [z]_S, and by part 2 of Lemma 4.5.5 it follows that
[x]_R = [y]_R and [x]_S = [z]_S. Therefore X = [x]_R ∩ [x]_S = [x]_T ∈ A/T.
22. F ⊗ F = {ℝ⁺ × ℝ⁺, ℝ⁻ × ℝ⁺, ℝ⁻ × ℝ⁻, ℝ⁺ × ℝ⁻, ℝ⁺ × {0}, ℝ⁻ × {0},
{0} × ℝ⁺, {0} × ℝ⁻, {(0, 0)}}. In geometric terms these are the four
quadrants of the plane, the positive and negative x-axes, the positive
and negative y-axes, and the origin.


24. (a) Hint: Let T = {(X, Y) ∈ A/S × A/S | ∃x ∈ X ∃y ∈ Y(xRy)}.
(b) Suppose x, y, x', y' ∈ A, xSx', and ySy'. Then [x]_S = [x']_S and [y]_S =
[y']_S, so xRy iff [x]_S T [y]_S iff [x']_S T [y']_S iff x'Ry'.


Chapter 5 


Section 5.1 
1. (a) Yes. 
(b) No. 
(c) Yes. 

3. (a) f(a) = b, f(b) =}, f(c) =a. 

(b) f(2) = 0. 

(c) f(π) = 3 and f(−π) = −4.

5. L ∘ H: N → N, and for every n ∈ N, (L ∘ H)(n) = n. Thus, L ∘ H = i_N.
H ∘ L: C → C, and for every c ∈ C, (H ∘ L)(c) = the capital of the
country in which c is located.

7. (a) Suppose that c ∈ C. We must prove that there is a unique b ∈ B
such that (c, b) ∈ f ↾ C.

Existence: Let b = f(c) ∈ B. Then (c, b) ∈ f and (c, b) ∈ C × B,
and therefore (c, b) ∈ f ∩ (C × B) = f ↾ C.

Uniqueness: Suppose that (c, b₁) ∈ f ↾ C and (c, b₂) ∈ f ↾ C.
Then (c, b₁) ∈ f and (c, b₂) ∈ f, so since f is a function, b₁ = b₂.

This proves that f ↾ C is a function from C to B. Finally, to derive
the formula for (f ↾ C)(c), suppose that c ∈ C, and let b = f(c). We
showed in the existence half of the proof that (c, b) ∈ f ↾ C. It
follows that

f(c) = b = (f ↾ C)(c).

(b) (→) Suppose g = f ↾ C. Then g = f ∩ (C × B), so clearly g ⊆ f. (←)
Suppose g ⊆ f. Suppose c ∈ C, and let b = g(c). Then (c, b) ∈ g, so
(c, b) ∈ f, and therefore f(c) = b. But then by part (a), (f ↾ C)(c) =
f(c) = b = g(c). Since c was arbitrary, it follows by Theorem 5.1.4
that g = f ↾ C.


(c) 


10. 


13. 


(b) 
15. 


(b) 


(c) 


19. 


(b) 


h ↾ ℤ = h ∩ (ℤ × ℝ) = {(x, y) ∈ ℝ × ℝ | y = 2x + 3} ∩ (ℤ × ℝ) = {(x,
y) ∈ ℤ × ℝ | y = 2x + 3} = g.


Since f ≠ g, by Theorem 5.1.4 we can choose some a ∈ A such that
f(a) ≠ g(a). Therefore (a, f(a)) ∈ f and (a, f(a)) ∉ g, so by the
definition of symmetric difference, (a, f(a)) ∈ f △ g, and similarly (a,
g(a)) ∈ f △ g. Since f(a) ≠ g(a), it follows that f △ g is not a function.
(a) Suppose b ∈ B. Since Dom(S) = B, we know that there is some
c ∈ C such that (b, c) ∈ S. To see that it is unique, suppose that
c' ∈ C and (b, c') ∈ S. Since Ran(R) = B, we can choose some a
∈ A such that (a, b) ∈ R. But then (a, c) ∈ S ∘ R and (a, c') ∈ S
∘ R, and since S ∘ R is a function, it follows that c = c'.
A = {1}, B = {2, 3}, C = {4}, R = {(1, 2), (1, 3)}, S = {(2, 4), (3, 4)}.
(a) No. Example: A = {1}, B = {2, 3}, f = {(1, 2)}, R = {(1, 1)}.
Yes. Suppose R is symmetric. Suppose (x, y) ∈ S. Then we can
choose some u and v in A such that f(u) = x, f(v) = y, and (u, v) ∈ R.
Since R is symmetric, (v, u) ∈ R, and therefore (y, x) ∈ S.
No. Example: A = {1, 2, 3, 4}, B = {5, 6, 7}, f = {(1, 5), (2, 6), (3, 6),
(4, 7)}, R = {(1, 2), (3, 4)}.
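
You can check the last counterexample mechanically. The sketch below assumes, as the argument for the symmetric case suggests, that S consists of the pairs (f(u), f(v)) with (u, v) ∈ R; the helper names are illustrative.

```python
def image_relation(f, r):
    """S = {(f(u), f(v)) : (u, v) in R}, with f given as a dict."""
    return {(f[u], f[v]) for (u, v) in r}


def is_transitive(rel):
    """True if (x, w) is in rel whenever (x, y) and (y, w) are."""
    return all((x, w) in rel
               for (x, y) in rel
               for (z, w) in rel
               if y == z)


f = {1: 5, 2: 6, 3: 6, 4: 7}
R = {(1, 2), (3, 4)}
S = image_relation(f, R)
```

Here R is transitive (vacuously), but S contains (5, 6) and (6, 7) without containing (5, 7), so S is not transitive.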
(a) Let a = 3 and c = 8. Then for any x > a = 3,

|f(x)| = |7x + 3| = 7x + 3 < 7x + x = 8x < 8x² = c|g(x)|.

This shows that f ∈ O(g).

Now suppose that g ∈ O(f). Then we can choose a ∈ ℤ⁺ and c ∈
ℝ⁺ such that ∀x > a(|g(x)| ≤ c|f(x)|), or in other words, ∀x > a(x² ≤
c(7x + 3)). Let x be any positive integer larger than both a and 10c.
Multiplying both sides of the inequality x > 10c by x, we can
conclude that x² > 10cx. But since x > a, we also have x² ≤ c(7x + 3)
< c(7x + 3x) = 10cx, so we have reached a contradiction. Therefore g
∉ O(f).

Clearly for any function f ∈ F we have ∀x ∈ ℤ⁺(|f(x)| ≤ 1 · |f(x)|), so
f ∈ O(f), and therefore (f, f) ∈ S. Thus, S is reflexive. To see that it is
also transitive, suppose (f, g) ∈ S and (g, h) ∈ S. Then there are
positive integers a₁ and a₂ and positive real numbers c₁ and c₂ such
that ∀x > a₁(|f(x)| ≤ c₁|g(x)|) and ∀x > a₂(|g(x)| ≤ c₂|h(x)|). Let a be
the maximum of a₁ and a₂, and let c = c₁c₂. Then for all x > a,

|f(x)| ≤ c₁|g(x)| ≤ c₁c₂|h(x)| = c|h(x)|.

Thus, (f, h) ∈ S, so S is transitive. Finally, to see that S is not a partial
order, we show that it is not antisymmetric. Let f and g be the
functions from ℤ⁺ to ℝ defined by the formulas f(x) = x and g(x) =
2x. Then for all x ∈ ℤ⁺, |f(x)| ≤ |g(x)| and |g(x)| ≤ 2|f(x)|, so f ∈ O(g)
and also g ∈ O(f). Therefore (f, g) ∈ S and (g, f) ∈ S, but f ≠ g.
(c) Since f₁ ∈ O(g), we can choose a₁ ∈ ℤ⁺ and c₁ ∈ ℝ⁺ such that ∀x >
a₁(|f₁(x)| ≤ c₁|g(x)|). Similarly, since f₂ ∈ O(g) we can choose a₂ ∈
ℤ⁺ and c₂ ∈ ℝ⁺ such that ∀x > a₂(|f₂(x)| ≤ c₂|g(x)|). Let a be the
maximum of a₁ and a₂, and let c = |s|c₁ + |t|c₂ + 1. (We have added 1
here just to make sure that c is positive, as required in the definition
of O.) Then for all x > a,

|f(x)| = |sf₁(x) + tf₂(x)| ≤ |s||f₁(x)| + |t||f₂(x)|
≤ |s|c₁|g(x)| + |t|c₂|g(x)| = (|s|c₁ + |t|c₂)|g(x)| ≤ c|g(x)|.

Therefore f ∈ O(g).
21. (a) Hint: Let h = {(X, y) ∈ A/R × B | ∃x ∈ X(f(x) = y)}.
(b) Hint: Use the fact that for all x and y in A, if xRy then [x]_R = [y]_R.


Section 5.2 


2. (a) f is not a function.
(b) f is not a function. g is a function that is onto, but not one-to-one.
(c) R is one-to-one and onto.


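5. (a) Suppose that x₁ ∈ A, x₂ ∈ A, and f(x₁) = f(x₂). Then we can
perform the following algebraic steps:

(x₁ + 1)/(x₁ − 1) = (x₂ + 1)/(x₂ − 1),
(x₁ + 1)(x₂ − 1) = (x₂ + 1)(x₁ − 1),
x₁x₂ − x₁ + x₂ − 1 = x₁x₂ − x₂ + x₁ − 1,
2x₂ = 2x₁,
x₂ = x₁.

This shows that f is one-to-one.
To show that f is onto, suppose that y ∈ A. Let

x = (y + 1)/(y − 1).

Notice that this is defined, since y ≠ 1, and also clearly x ≠ 1, so x ∈
A. Then

f(x) = (x + 1)/(x − 1) = ((y + 1)/(y − 1) + 1)/((y + 1)/(y − 1) − 1) = (2y/(y − 1))/(2/(y − 1)) = y.

(b) For any x ∈ A,

(f ∘ f)(x) = f((x + 1)/(x − 1)) = ((x + 1)/(x − 1) + 1)/((x + 1)/(x − 1) − 1) = (2x/(x − 1))/(2/(x − 1)) = x = i_A(x).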

9. (a) {1, 2, 3, 4}. 

(b) f is onto, but not one-to-one.

13. (a) Suppose that f is one-to-one. Suppose that c₁ ∈ C, c₂ ∈ C, and
(f ↾ C)(c₁) = (f ↾ C)(c₂). By exercise 7(a) of Section 5.1, it
follows that f(c₁) = f(c₂), so since f is one-to-one, c₁ = c₂.

(b) Suppose that f ↾ C is onto. Suppose b ∈ B. Then since f ↾ C is onto,
we can choose some c ∈ C such that (f ↾ C)(c) = b. But then c ∈ A,
and by exercise 7(a) of Section 5.1, f(c) = b.

(c) Let A = B = ℝ and C = ℝ⁺. For (a), use f(x) = |x|, and for (b), use f(x)
= x.

17. (a) Suppose R is reflexive and f is onto. Let x ∈ B be arbitrary.
Since f is onto, we can choose some u ∈ A such that f(u) = x.
Since R is reflexive, (u, u) ∈ R. Therefore (x, x) ∈ S.

(b) Suppose R is transitive and f is one-to-one. Suppose that (x, y) ∈ S
and (y, z) ∈ S. Since (x, y) ∈ S, we can choose some u and v in A
such that f(u) = x, f(v) = y, and (u, v) ∈ R. Similarly, since (y, z) ∈ S
we can choose p and q in A such that f(p) = y, f(q) = z, and (p, q) ∈
R. Since f(v) = y = f(p) and f is one-to-one, v = p. Therefore (v, q) =
(p, q) ∈ R. Since we also have (u, v) ∈ R, by the transitivity of R it
follows that (u, q) ∈ R, so (x, z) ∈ S.

20. (a) Let b ∈ B be arbitrary. Since f is onto, we can choose some a ∈
A such that f(a) = b. Therefore g(b) = (g ∘ f)(a) = (h ∘ f)(a) =
h(b). Since b was arbitrary, this shows that ∀b ∈ B(g(b) = h(b)),
so g = h.

(b) Let c₁ and c₂ be two distinct elements of C. Suppose b ∈ B. Let g and
h be functions from B to C such that ∀x ∈ B(g(x) = c₁), ∀x ∈ B \ {b}
(h(x) = c₁), and h(b) = c₂. (Formally, g = B × {c₁} and h = [(B \ {b})
× {c₁}] ∪ {(b, c₂)}.) Then g ≠ h, so by assumption g ∘ f ≠ h ∘ f, and
therefore we can choose some a ∈ A such that g(f(a)) ≠ h(f(a)). But
by the way g and h were defined, the only x ∈ B for which g(x) ≠
h(x) is x = b, so it follows that f(a) = b. Since b was arbitrary, this
shows that f is onto.


Section 5.3 


R⁻¹(p) = the person sitting immediately to the right of p.
Let g(x) = (3x − 5)/2. Then for any x ∈ ℝ,

f(g(x)) = (2(3x − 5)/2 + 5)/3 = (3x − 5 + 5)/3 = 3x/3 = x

and

g(f(x)) = (3(2x + 5)/3 − 5)/2 = (2x + 5 − 5)/2 = 2x/2 = x.

Therefore f ∘ g = i_ℝ and g ∘ f = i_ℝ, and by Theorems 5.3.4 and 5.3.5 it
follows that f is one-to-one and onto and f⁻¹ = g.


11. 


(b) 
(c) 
14, 


(b) 


16. 


18. 


f(x) =2-log x. 

Suppose that f: A → B, g: B → A, and f ∘ g = i_B. Let b be an arbitrary
element of B. Let a = g(b) ∈ A. Then f(a) = f(g(b)) = (f ∘ g)(b) = i_B(b)
= b. Since b was arbitrary, this shows that f is onto.

(a) Suppose that f is one-to-one and f ∘ g = i_B. By part 2 of Theorem
5.3.3, f is also onto, so f⁻¹: B → A and f⁻¹ ∘ f = i_A. This gives us
enough information to imitate the reasoning in the proof of
Theorem 5.3.5:

g = i_A ∘ g = (f⁻¹ ∘ f) ∘ g = f⁻¹ ∘ (f ∘ g) = f⁻¹ ∘ i_B = f⁻¹.
Hint: Imitate the solution to part (a).
Hint: Use parts (a) and (b), together with Theorem 5.3.3.
(a) Suppose x ∈ A' = Ran(g). Then we can choose some b ∈ B such
that g(b) = x. Therefore (g ∘ f)(x) = g(f(g(b))) = g((f ∘ g)(b)) = g(i_B(b))
= g(b) = x.
By the given information, (f ↾ A') ∘ g = i_B, and by part (a), g ∘ (f ↾
A') = i_A'. Therefore by Theorem 5.3.4, f ↾ A' is a one-to-one, onto
function from A' to B, and by Theorem 5.3.5, g = (f ↾ A')⁻¹.
Hint: Suppose x ∈ ℝ. To determine whether or not x ∈ Ran(f), you
must see if you can find a real number y such that f(y) = x. In other
words, you must try to solve the equation 4y − y² = x for y in terms of
x. Notice that this is similar to the method we used in part 1 of
Example 5.3.6. However, in this case you will find that for some
values of x there is no solution for y, and for some values of x there is
more than one solution for y.


Since g is one-to-one and onto, g⁻¹: C → B. Let h = g⁻¹ ∘ f. Then h: A
→ B and

g ∘ h = g ∘ (g⁻¹ ∘ f)
= (g ∘ g⁻¹) ∘ f (Theorem 4.2.5)
= i_C ∘ f (Theorem 5.3.2)
= f (exercise 9 of Section 4.3).


Section 5.4 


1. 
(b) 
(c) 
(d) 

A 

T 


(b) 


(c) 


12, 
(b) 
14. 
a7. 


(b) 
(c) 


(a) No. 
Yes. 
Yes. 
No. 
{-1, 0, 1, 2}. 
Suppose C ⊆ A and C is closed under f. Suppose x ∈ A \ C, so x ∈ A
and x ∉ C. Then f⁻¹(x) ∈ A. Suppose f⁻¹(x) ∈ C. Then since C is
closed under f, x = f(f⁻¹(x)) ∈ C, which is a contradiction. Therefore
f⁻¹(x) ∉ C, so f⁻¹(x) ∈ A \ C. Since x was an arbitrary element of A \
C, this shows that A \ C is closed under f⁻¹.
(a) Suppose x ∈ C₁ ∪ C₂. Then either x ∈ C₁ or x ∈ C₂.
Case 1. x ∈ C₁. Then since C₁ is closed under f, f(x) ∈ C₁, so f(x)
∈ C₁ ∪ C₂.
Case 2. x ∈ C₂. Then since C₂ is closed under f, f(x) ∈ C₂, so f(x)
∈ C₁ ∪ C₂.
Therefore f(x) ∈ C₁ ∪ C₂. Since x was arbitrary, we can conclude
that C₁ ∪ C₂ is closed under f.
Yes. Proof: Suppose x ∈ C₁ ∩ C₂. Then x ∈ C₁ and x ∈ C₂. Since x
∈ C₁ and C₁ is closed under f, f(x) ∈ C₁. Similarly, f(x) ∈ C₂.
Therefore f(x) ∈ C₁ ∩ C₂, so since x was arbitrary, C₁ ∩ C₂ is closed
under f.
No. Here is a counterexample: A = {1, 2}, f = {(1, 2), (2, 2)}, C₁ = {1,
2}, C₂ = {2}.
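
The three answers above can be tested on the counterexample itself. This sketch assumes that the final question asks whether the set difference C₁ \ C₂ must be closed under f (the question is not restated here, so that reading is an assumption); the helper name is illustrative.

```python
def is_closed(c, f):
    """True if f(x) is in c for every x in c (f given as a dict)."""
    return all(f[x] in c for x in c)


f = {1: 2, 2: 2}
C1, C2 = {1, 2}, {2}

union_closed = is_closed(C1 | C2, f)
intersection_closed = is_closed(C1 & C2, f)
difference_closed = is_closed(C1 - C2, f)
```

Both C₁ and C₂ are closed under f, and so are their union and intersection, but C₁ \ C₂ = {1} is not, because f(1) = 2.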
(a) Z. 
{X ⊆ ℕ | X is finite}.
Z. 
(a) Yes. 
Yes. 
Yes. 


(d) No. (The composition of two strictly decreasing functions is strictly 
increasing.) 
20. (b) and (e) are closed under f. 


Chapter 6 


Section 6.1 


1. Base case: When n = 0, both sides of the equation are 0.
Induction step: Suppose that n ∈ ℕ and 0 + 1 + 2 + ··· + n = n(n + 1)/2.
Then

0 + 1 + 2 + ··· + (n + 1) = (0 + 1 + 2 + ··· + n) + (n + 1)
= n(n + 1)/2 + (n + 1)
= (n + 1)(n/2 + 1) = (n + 1)(n + 2)/2,

as required.
3. Base case: When n = 0, both sides of the equation are 0.
Induction step: Suppose n ∈ ℕ and 0³ + 1³ + 2³ + ··· + n³ = [n(n +
1)/2]². Then

0³ + 1³ + 2³ + ··· + (n + 1)³ = (0³ + 1³ + 2³ + ··· + n³) + (n + 1)³
= [n(n + 1)/2]² + (n + 1)³
= (n + 1)²[n²/4 + n + 1]
= (n + 1)²[(n² + 4n + 4)/4]
= [(n + 1)(n + 2)/2]².


7. Hint: The formula is (3ⁿ⁺¹ − 1)/2.
10. Base case: When n = 0, 9ⁿ − 8n − 1 = 0 = 64 · 0, so 64 | (9ⁿ − 8n − 1).
Induction step: Suppose that n ∈ ℕ and 64 | (9ⁿ − 8n − 1). Then
there is some integer k such that 9ⁿ − 8n − 1 = 64k. Therefore

9ⁿ⁺¹ − 8(n + 1) − 1 = 9ⁿ⁺¹ − 8n − 9
= 9ⁿ⁺¹ − 72n − 9 + 64n
= 9(9ⁿ − 8n − 1) + 64n
= 9(64k) + 64n
= 64(9k + n),
so 64 | (9ⁿ⁺¹ − 8(n + 1) − 1).
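
The key identity in the induction step, and the conclusion itself, can be checked numerically for small n. The following Python sketch is only a sanity check, and the function name g is an illustrative choice.

```python
def g(n):
    """The quantity 9^n - 8n - 1 from the exercise."""
    return 9**n - 8*n - 1


# The induction step rests on g(n + 1) = 9*g(n) + 64*n.
step_identity_holds = all(g(n + 1) == 9 * g(n) + 64 * n for n in range(50))

# The conclusion: 64 divides g(n).
divisible = all(g(n) % 64 == 0 for n in range(50))
```

Of course, checking fifty cases proves nothing about all natural numbers; the induction argument is what does that.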
12. (a) Base case: When n = 0, 7ⁿ − 5ⁿ = 0 = 2 · 0, so 7ⁿ − 5ⁿ is even.
Induction step: Suppose n ∈ ℕ and 7ⁿ − 5ⁿ is even. Then there is
some integer k such that 7ⁿ − 5ⁿ = 2k. Therefore
7ⁿ⁺¹ − 5ⁿ⁺¹ = 7 · 7ⁿ − 5 · 5ⁿ = 2 · 7ⁿ + 5 · (7ⁿ − 5ⁿ)
= 2 · 7ⁿ + 5 · 2k = 2(7ⁿ + 5k),
so 7ⁿ⁺¹ − 5ⁿ⁺¹ is even.

(b) For the induction step, you might find it useful to complete the
following equation: 2 · 7ⁿ⁺¹ − 3 · 5ⁿ⁺¹ + 1 = 2 · 7ⁿ − 3 · 5ⁿ + 1 + ?.


15. Base case: When n = 10, 2ⁿ = 1024 > 1000 = n³.
Induction step: Suppose n ≥ 10 and 2ⁿ > n³. Then

2ⁿ⁺¹ = 2 · 2ⁿ
> 2n³ (inductive hypothesis)
= n³ + n³
≥ n³ + 10n² (since n ≥ 10)
= n³ + 3n² + 7n²
≥ n³ + 3n² + 70n (since n ≥ 10)
= n³ + 3n² + 3n + 67n
> n³ + 3n² + 3n + 1 = (n + 1)³.
20. (a) Base case: When n = 1, the statement to be proven is 0 < a < b,
which was given.

Induction step: Suppose that n ≥ 1 and 0 < aⁿ < bⁿ. Multiplying
this inequality by the positive number a we get 0 < aⁿ⁺¹ < abⁿ, and
multiplying the inequality a < b by the positive number bⁿ gives us
abⁿ < bⁿ⁺¹. Combining these inequalities, we can conclude that 0 <
aⁿ⁺¹ < bⁿ⁺¹.

(b) Hint: First note that ⁿ√a and ⁿ√b are both positive. (For n odd, this
follows from exercise 19. For n even, each of a and b has two nth
roots, one positive and one negative, but ⁿ√a and ⁿ√b are by definition
the positive roots.) Now use proof by contradiction, and apply part
(a).

(c) Hint: The inequality to be proven can be rearranged to read aⁿ⁺¹ − abⁿ
− baⁿ + bⁿ⁺¹ > 0. Now factor the left side of this inequality.

(d) Hint: Use mathematical induction. For the base case, use the n = 1
case of part (c). For the induction step, multiply both sides of the
inductive hypothesis by (a + b)/2 and then apply part (c).


Section 6.2 


1. (a) We must prove that R' is reflexive (on A'), transitive, and
antisymmetric. For the first, suppose x ∈ A'. Since R is reflexive
(on A) and x ∈ A, (x, x) ∈ R, so (x, x) ∈ R ∩ (A' × A') = R'. This
shows that R' is reflexive.

Next, suppose that (x, y) ∈ R' and (y, z) ∈ R'. Then (x, y) ∈ R, (y,
z) ∈ R, and x, y, z ∈ A'. Since R is transitive, (x, z) ∈ R, so (x, z) ∈
R ∩ (A' × A') = R'. Therefore R' is transitive.

Finally, suppose that (x, y) ∈ R' and (y, x) ∈ R'. Then (x, y) ∈ R
and (y, x) ∈ R, so since R is antisymmetric, x = y. Thus R' is
antisymmetric.

(b) To see that T is reflexive, suppose x ∈ A. If x = a, then (x, x) = (a, a)
∈ {a} × A ⊆ T. If x ≠ a, then x ∈ A', so since R' is reflexive, (x, x) ∈
R' ⊆ T' ⊆ T.

For transitivity, suppose that (x, y) ∈ T and (y, z) ∈ T. If x = a then
(x, z) = (a, z) ∈ {a} × A ⊆ T. Now suppose x ≠ a. Then (x, y) ∉ {a}
× A, so since (x, y) ∈ T = T' ∪ ({a} × A) we must have (x, y) ∈ T'.
But T' ⊆ A' × A', so y ∈ A' and therefore y ≠ a. Similar reasoning
now shows that (y, z) ∈ T'. Since T' is transitive, it follows that (x, z)
∈ T' ⊆ T.

To show that T is antisymmetric, suppose (x, y) ∈ T and (y, x) ∈
T. If x = a then (y, x) ∉ T', so (y, x) ∈ {a} × A and therefore y = a =
x. Similarly, if y = a then x = y. Now suppose x ≠ a and y ≠ a. Then
as in the proof of transitivity it follows that (x, y) ∈ T' and (y, x) ∈
T', so by antisymmetry of T', x = y.

We now know that T is a partial order. To see that it is total,
suppose x ∈ A and y ∈ A. If x = a then (x, y) ∈ {a} × A ⊆ T.
Similarly, if y = a then (y, x) ∈ T. Now suppose x ≠ a and y ≠ a.
Then x ∈ A' and y ∈ A', so since T' is a total order, either (x, y) ∈ T'
⊆ T or (y, x) ∈ T' ⊆ T.

Finally, to see that R ⊆ T, suppose that (x, y) ∈ R. If x = a then (x,
y) ∈ {a} × A ⊆ T. Now suppose x ≠ a. If y = a then the fact that (x, y)
∈ R would contradict the R-minimality of a. Therefore y ≠ a. But
then (x, y) ∈ R ∩ (A' × A') = R' ⊆ T' ⊆ T.

(a) We will prove the statement: ∀n ≥ 1 ∀B ⊆ A[B has n elements →
∃x ∈ B ∀y ∈ B((x, y) ∈ R ∘ R)]. We proceed by induction on n.

Base case: Suppose n = 1. If B ⊆ A and B has one element, then
for some x ∈ B, B = {x}. Since R is reflexive, (x, x) ∈ R, and
therefore (x, x) ∈ R ∘ R. But x is the only element in B, so ∀y ∈
B((x, y) ∈ R ∘ R), as required.

Induction step: Suppose that n ≥ 1 and for every B ⊆ A, if B has n
elements then ∃x ∈ B ∀y ∈ B((x, y) ∈ R ∘ R). Now suppose that B ⊆
A and B has n + 1 elements. Choose some b ∈ B, and let B' = B \ {b}.
Then B' ⊆ A and B' has n elements, so by the inductive hypothesis
there is some x ∈ B' such that ∀y ∈ B'((x, y) ∈ R ∘ R). We now
consider two cases.

Case 1: (x, b) ∈ R ∘ R. Then ∀y ∈ B((x, y) ∈ R ∘ R), so we are
done.

Case 2: (x, b) ∉ R ∘ R. In this case, we will prove that ∀y ∈ B((b,
y) ∈ R ∘ R). To do this, let y ∈ B be arbitrary. If y = b, then since R is
reflexive, (b, b) ∈ R, and therefore (b, y) = (b, b) ∈ R ∘ R. Now
suppose y ≠ b. Then y ∈ B', so by the choice of x we know that (x, y)
∈ R ∘ R. This means that for some z ∈ A, (x, z) ∈ R and (z, y) ∈ R.
We have (x, z) ∈ R, so if (z, b) ∈ R then (x, b) ∈ R ∘ R, contrary to
the assumption for this case. Therefore (z, b) ∉ R, so by the
hypothesis on R, (b, z) ∈ R. But then since (b, z) ∈ R and (z, y) ∈ R,
we have (b, y) ∈ R ∘ R, as required.
(b) Hint: Let A = B = the set of contestants and let R = {(x, y) ∈ A × A | x
beats y} ∪ i_A. Now apply part (a).
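
To see what part (a) says in the setting of part (b), you can search a small tournament for a contestant x with (x, y) ∈ R ∘ R for every contestant y. The Python sketch below is an illustration under assumptions: the contestants and results are made up, and the function names are not from the text.

```python
def compose_self(r):
    """R ∘ R = {(x, y) : (x, z) in R and (z, y) in R for some z}."""
    return {(x, y) for (x, z) in r for (w, y) in r if z == w}


def find_two_step_winner(players, beats):
    """Return some x with (x, y) in R ∘ R for every player y, where
    R = beats ∪ identity; return None if no such x exists."""
    r = set(beats) | {(p, p) for p in players}
    rr = compose_self(r)
    for x in players:
        if all((x, y) in rr for y in players):
            return x
    return None


players = ["a", "b", "c", "d"]
# Hypothetical results: every pair of contestants plays exactly once.
beats = {("a", "b"), ("b", "c"), ("c", "a"),
         ("a", "d"), ("d", "b"), ("c", "d")}
winner = find_two_step_winner(players, beats)
```

In this made-up tournament a beats b and d directly, and beats c in two steps (a beats b, who beats c), so the search stops at a.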
8. (a) Let m = (a + b)/2, the arithmetic mean of a and b, and let d = (a
− b)/2. Then it is easy to check that m + d = a and m − d = b, so

√(ab) = √((m + d)(m − d)) = √(m² − d²) ≤ √(m²) = m = (a + b)/2.


(b) We use induction on n.
Base case: n = 1. This case is taken care of by part (a).
Induction step: Suppose n ≥ 1, and the arithmetic mean-geometric
mean inequality holds for lists of length 2ⁿ. Now let a₁, a₂, ...,
a_{2ⁿ⁺¹} be a list of 2ⁿ⁺¹ positive real numbers. Let

m₁ = (a₁ + a₂ + ··· + a_{2ⁿ})/2ⁿ and m₂ = (a_{2ⁿ+1} + a_{2ⁿ+2} + ··· + a_{2ⁿ⁺¹})/2ⁿ.

Notice that a₁ + a₂ + ··· + a_{2ⁿ} = m₁2ⁿ, and similarly a_{2ⁿ+1} + a_{2ⁿ+2} +
··· + a_{2ⁿ⁺¹} = m₂2ⁿ. Also, by the inductive hypothesis, we know that
m₁ ≥ (a₁a₂ ··· a_{2ⁿ})^(1/2ⁿ) and m₂ ≥ (a_{2ⁿ+1}a_{2ⁿ+2} ··· a_{2ⁿ⁺¹})^(1/2ⁿ). Therefore

(a₁ + a₂ + ··· + a_{2ⁿ⁺¹})/2ⁿ⁺¹ = (m₁2ⁿ + m₂2ⁿ)/2ⁿ⁺¹ = (m₁ + m₂)/2 ≥ √(m₁m₂)
≥ √((a₁a₂ ··· a_{2ⁿ})^(1/2ⁿ) · (a_{2ⁿ+1}a_{2ⁿ+2} ··· a_{2ⁿ⁺¹})^(1/2ⁿ))
= (a₁a₂ ··· a_{2ⁿ⁺¹})^(1/2ⁿ⁺¹).
(c) We use induction on n.
Base case: If n = n₀, then by assumption the arithmetic mean-
geometric mean inequality fails for some list of length n.
Induction step: Suppose n ≥ n₀, and there are positive real numbers
a₁, a₂, ..., a_n such that

(a₁ + a₂ + ··· + a_n)/n < (a₁a₂ ··· a_n)^(1/n).

Let m = (a₁ + a₂ + ··· + a_n)/n, and let a_{n+1} = m. Then we have
m < (a₁a₂ ··· a_n)^(1/n), so mⁿ < a₁a₂ ··· a_n. Multiplying both sides of this
inequality by m gives us mⁿ⁺¹ < a₁a₂ ··· a_n m = a₁a₂ ··· a_{n+1}, so
m < (a₁a₂ ··· a_{n+1})^(1/(n+1)). But notice that we also have mn = a₁ + a₂ + ···
+ a_n, so

(a₁ + ··· + a_{n+1})/(n + 1) = (mn + m)/(n + 1) = m(n + 1)/(n + 1) = m < (a₁a₂ ··· a_{n+1})^(1/(n+1)).

Thus, we have a list of length n + 1 for which the arithmetic mean-
geometric mean inequality fails.

(d) Suppose that the arithmetic mean-geometric mean inequality fails for
some list of positive real numbers. Let n₀ be the length of this list,
and choose an integer n ≥ 1 such that n₀ ≤ 2ⁿ. (In fact, we could just
let n = n₀, as you will show in exercise 12(a) in Section 6.3.) Then by
part (b), the arithmetic mean-geometric mean inequality holds for all
lists of length 2ⁿ, but by part (c), it must fail for some list of length
2ⁿ. This is a contradiction, so the inequality must always hold.

10. (a) Hint: Show that (a₁b₁ + a₂b₂) − (a₁b₂ + a₂b₁) ≥ 0.

(b) Use induction on n. For the induction step, assume the result holds
for sequences of length n, and suppose a₁ ≤ a₂ ≤ ··· ≤ a_n ≤ a_{n+1}, b₁ ≤
b₂ ≤ ··· ≤ b_n ≤ b_{n+1}, and f is a one-to-one, onto function from {1, 2,
..., n + 1} to itself. Now consider two cases. For case 1, assume that
f(n + 1) = n + 1, and use the inductive hypothesis to complete the
proof. For case 2, assume that f(n + 1) < n + 1. Find a one-to-one,
onto function g from {1, 2, ..., n + 1} to itself such that g is almost
the same as f but g(n + 1) = n + 1, and show that

a₁b_{f(1)} + ··· + a_{n+1}b_{f(n+1)} ≤ a₁b_{g(1)} + ··· + a_{n+1}b_{g(n+1)}
≤ a₁b₁ + ··· + a_{n+1}b_{n+1}.


11. We proceed by induction on n.

Base case: n = 0. If A has 0 elements, then A = ∅, so 𝒫(A) = {∅},
which has 1 = 2⁰ elements.

Induction step: Suppose that for every set A with n elements, 𝒫(A)
has 2ⁿ elements. Now suppose that A has n + 1 elements. Let a be any
element of A, and let A' = A \ {a}. Then A' has n elements, so by the
inductive hypothesis 𝒫(A') has 2ⁿ elements. There are two kinds of
subsets of A: those that contain a as an element, and those that don't.
The subsets that don't contain a are just the subsets of A', and there
are 2ⁿ of these. Those that do contain a are the sets of the form X
∪ {a}, where X ∈ 𝒫(A'), and there are also 2ⁿ of these, since there
are 2ⁿ possible choices for X. Thus the total number of elements of
𝒫(A) is 2ⁿ + 2ⁿ = 2ⁿ⁺¹.
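
The induction step translates directly into a recursive procedure for listing the subsets of a finite set. The sketch below is one possible rendering in Python; representing the set as a list and the name power_set are illustrative choices.

```python
def power_set(a):
    """Return a list of all subsets of the list `a`, following the proof."""
    if not a:
        return [set()]                 # base case: the only subset of ∅ is ∅
    first, rest = a[0], a[1:]
    without = power_set(rest)          # subsets that do not contain `first`
    with_first = [x | {first} for x in without]  # subsets that do contain it
    return without + with_first


sizes = [len(power_set(list(range(n)))) for n in range(6)]
```

Each recursive call doubles the number of subsets, which is exactly the 2ⁿ + 2ⁿ = 2ⁿ⁺¹ count in the proof.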


14. Base case: n = 1. One chord cuts the circle into two regions, and (n² +
n + 2)/2 = 2.

Induction step: Suppose that when n chords are drawn, the circle is
cut into (n² + n + 2)/2 regions. When another chord is drawn, it will
intersect each of the first n chords exactly once. Therefore it will
pass through n + 1 regions, cutting each of those regions in two. (Each
time it crosses one of the first n chords, it passes from one region to
another.) Therefore the number of regions after the next chord is
drawn is

(n² + n + 2)/2 + (n + 1) = (n² + 3n + 4)/2 = ((n + 1)² + (n + 1) + 2)/2,

as required.


Section 6.3 


1. Hint: The formula is 


6. Base case: n = 1. Then

∑_{i=1}^{n} 1/i² = 1 ≤ 1 = 2 − 1/n.

Induction step: Suppose that

∑_{i=1}^{n} 1/i² ≤ 2 − 1/n.

Then

∑_{i=1}^{n+1} 1/i² = ∑_{i=1}^{n} 1/i² + 1/(n + 1)² ≤ 2 − 1/n + 1/(n + 1)²
= 2 − (n² + n + 1)/(n(n + 1)²) < 2 − (n² + n)/(n(n + 1)²) = 2 − 1/(n + 1).


8. (a) We let m be arbitrary and then prove by induction that for all n ≥
m, H_n − H_m ≥ (n − m)/n.
Base case: n = m. Then H_n − H_m = 0 ≥ 0 = (n − m)/n.
Induction step: Suppose that n ≥ m and H_n − H_m ≥ (n − m)/n. Then

H_{n+1} − H_m = H_n + 1/(n + 1) − H_m ≥ (n − m)/n + 1/(n + 1)
≥ (n − m)/(n + 1) + 1/(n + 1) = (n + 1 − m)/(n + 1).

(b) Base case: If n = 0 then H_{2ⁿ} = H₁ = 1 ≥ 1 = 1 + n/2.
Induction step: Suppose n ≥ 0 and H_{2ⁿ} ≥ 1 + n/2. By part (a),

H_{2ⁿ⁺¹} − H_{2ⁿ} ≥ (2ⁿ⁺¹ − 2ⁿ)/2ⁿ⁺¹ = 2ⁿ/2ⁿ⁺¹ = 1/2.

Therefore

H_{2ⁿ⁺¹} ≥ H_{2ⁿ} + 1/2 ≥ 1 + n/2 + 1/2 = 1 + (n + 1)/2.
(c) Since lim_{n→∞}(1 + n/2) = ∞, by part (b) lim_{n→∞} H_{2ⁿ} = ∞. Clearly the
H_n's form an increasing sequence, so lim_{n→∞} H_n = ∞.
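
The inequalities in parts (a) and (b) can be spot-checked with exact fractions. This sketch assumes the usual definition H_n = 1 + 1/2 + ··· + 1/n; the function name and the ranges tested are arbitrary choices.

```python
from fractions import Fraction


def harmonic(n):
    """H_n = 1 + 1/2 + ... + 1/n as an exact fraction (H_0 = 0)."""
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))


# Part (a): H_n - H_m >= (n - m)/n whenever n >= m >= 1.
part_a_holds = all(harmonic(n) - harmonic(m) >= Fraction(n - m, n)
                   for m in range(1, 15)
                   for n in range(m, 15))

# Part (b): H_(2^n) >= 1 + n/2.
part_b_holds = all(harmonic(2**n) >= 1 + Fraction(n, 2) for n in range(8))
```

The bound in part (b) grows without limit, which is the whole point of part (c), even though the harmonic numbers themselves grow very slowly.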


12. (a) Hint: Try proving that 2ⁿ ≥ n + 1, from which the desired
conclusion follows.

(b) Base case: n = 9. Then n! = 362880 ≥ 262144 = (2ⁿ)².
Induction step: Suppose that n ≥ 9 and n! ≥ (2ⁿ)². Then

(n + 1)! = (n + 1) · n! ≥ (n + 1) · (2ⁿ)² ≥ 10 · 2²ⁿ ≥ 2² · 2²ⁿ
= 2²ⁿ⁺²
= (2ⁿ⁺¹)².

(c) Base case: n = 0. Then n! = 1 ≤ 1 = 2^(n²).

Induction step: Suppose that n! ≤ 2^(n²). Then
2^((n+1)²) = 2^(n²+2n+1) = 2^(n²) · 2^(2n+1) ≥ 2^(n²) · 2^(n+1)
≥ n! · (n + 1) (by inductive hypothesis and part (a))
= (n + 1)!.


15. Base case: n = 0. Then a_n = a₀ = 0 = 2⁰ − 0 − 1 = 2ⁿ − n − 1.

Induction step: Suppose that n ∈ ℕ and a_n = 2ⁿ − n − 1. Then

a_{n+1} = 2a_n + n = 2(2ⁿ − n − 1) + n
= 2ⁿ⁺¹ − 2n − 2 + n = 2ⁿ⁺¹ − n − 2 = 2ⁿ⁺¹ − (n + 1) − 1.


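18. (a) (n choose 0) = n!/(0! n!) = 1 and (n choose n) = n!/(n! 0!) = 1.

(b) (n choose k) + (n choose k − 1) = n!/(k!(n − k)!) + n!/((k − 1)!(n − k + 1)!)
= n!(n − k + 1)/(k!(n − k + 1)!) + n!k/(k!(n − k + 1)!)
= n!(n + 1)/(k!(n + 1 − k)!) = (n + 1)!/(k!(n + 1 − k)!) = (n + 1 choose k).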


(c) We follow the hint.

Base case: n = 0. Suppose A is a set with 0 elements. Then A = ∅,
the only value of k we have to worry about is k = 0, 𝒫₀(A) = {∅},
which has 1 element, and (0 choose 0) = 1.

Induction step: Suppose the desired conclusion holds for sets with
n elements, and A is a set with n + 1 elements. Let a be an element of
A, and let A' = A \ {a}, which is a set with n elements. Now suppose
0 ≤ k ≤ n + 1. We consider three cases.

Case 1: k = 0. Then 𝒫_k(A) = {∅}, which has 1 element, and
(n + 1 choose k) = 1.

Case 2: k = n + 1. Then 𝒫_k(A) = {A}, which has 1 element, and
(n + 1 choose k) = 1.

Case 3: 0 < k ≤ n. There are two kinds of k-element subsets of A:
those that contain a as an element, and those that don't. The
k-element subsets that don't contain a are just the k-element subsets of
A', and by the inductive hypothesis there are (n choose k) of these. Those that
do contain a are the sets of the form X ∪ {a}, where X ∈ 𝒫_{k−1}(A'),
and by the inductive hypothesis there are (n choose k − 1) of these, since this is
the number of possibilities for X. Therefore by part (b), the total
number of k-element subsets of A is

(n choose k) + (n choose k − 1) = (n + 1 choose k).
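
The three cases in this induction step give a recursive way to count k-element subsets. The Python sketch below follows that case split and compares the result with the standard library; the function name is an illustrative choice.

```python
from itertools import combinations
from math import comb


def count_k_subsets(n, k):
    """Count the k-element subsets of an n-element set by the case split
    used in the proof (k = 0, k = n, and 0 < k < n)."""
    if k < 0 or k > n:
        return 0
    if k == 0 or k == n:
        return 1
    # Subsets that avoid a fixed element, plus subsets that contain it.
    return count_k_subsets(n - 1, k) + count_k_subsets(n - 1, k - 1)


agrees = all(
    count_k_subsets(n, k) == comb(n, k) == len(list(combinations(range(n), k)))
    for n in range(9)
    for k in range(n + 1)
)
```

The recursion is the identity from part (b) read from right to left, so the agreement with the library's binomial coefficients is what the proof leads you to expect.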
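(d) We let x and y be arbitrary and then prove the equation by induction
on n.
Base case: n = 0. Then both sides of the equation are equal to 1.
Induction step: We will make use of parts (a) and (b). Suppose that

(x + y)ⁿ = ∑_{k=0}^{n} (n choose k) x^(n−k) y^k.

Then

(x + y)ⁿ⁺¹ = (x + y)(x + y)ⁿ
= (x + y) ∑_{k=0}^{n} (n choose k) x^(n−k) y^k (inductive hypothesis)
= (x + y)[(n choose 0)xⁿ + (n choose 1)xⁿ⁻¹y + (n choose 2)xⁿ⁻²y² + ··· + (n choose n)yⁿ]
= (n choose 0)xⁿ⁺¹ + [(n choose 0) + (n choose 1)]xⁿy + ··· + [(n choose n − 1) + (n choose n)]xyⁿ + (n choose n)yⁿ⁺¹
= (n + 1 choose 0)xⁿ⁺¹ + (n + 1 choose 1)xⁿy + ··· + (n + 1 choose n)xyⁿ + (n + 1 choose n + 1)yⁿ⁺¹
= ∑_{k=0}^{n+1} (n + 1 choose k) x^(n+1−k) y^k.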


20. Hint: Surprisingly, it is easier to prove that for all n ≥ 1, 0 < a_n < 1/2.


Section 6.4 


1. (a) (→) Suppose that ∀nQ(n). Let n be arbitrary. Then Q(n + 1) is
true, which means ∀k < n + 1 P(k). In particular, since n < n + 1,
P(n) is true. Since n was arbitrary, this shows that ∀nP(n).
(←) Suppose that ∀nP(n). Then for any n, it is clearly true that ∀k
< nP(k), which means that Q(n) is true.

(b) Base case: n = 0. Then Q(n) is the statement ∀k < 0 P(k), which is
vacuously true.

Induction step: Suppose Q(n) is true. This means that ∀k < nP(k)
is true, so by assumption, it follows that P(n) is true. Therefore ∀k <
n + 1 P(k) is true, which means that Q(n + 1) is true.

4. (a) Suppose √6 is rational. Let S = {q ∈ ℤ⁺ | ∃p ∈ ℤ⁺(p/q = √6)}.
Then S ≠ ∅, so we can let q be the smallest element of S, and we can
choose a positive integer p such that p/q = √6. Therefore p² = 6q², so
p² is even, and hence p is even. This means that p = 2p' for some
integer p'. Thus 4(p')² = 6q², so 2(p')² = 3q² and therefore 3q² is even.
It is easy to check that if q is odd then 3q² is odd, so q must be even,
which means that q = 2q' for some integer q'. But then √6 = p'/q' and
q' < q, contradicting the fact that q is the smallest element of S.

(b) Suppose that √2 + √3 = p/q. Squaring both sides gives us 5 +
2√6 = p²/q², so √6 = (p² − 5q²)/(2q²), which contradicts part (a).
(a) We use ordinary induction on n.
Base case: n = 0. Both sides of the equation are equal to 0.
Induction step: Suppose that ∑_{i=0}^{n} F_i = F_{n+2} − 1. Then

∑_{i=0}^{n+1} F_i = ∑_{i=0}^{n} F_i + F_{n+1} = (F_{n+2} − 1) + F_{n+1} = F_{n+3} − 1.

(b) We use ordinary induction on n.
Base case: n = 0. Both sides of the equation are equal to 0.
Induction step: Suppose that ∑_{i=0}^{n} (F_i)² = F_n F_{n+1}. Then

∑_{i=0}^{n+1} (F_i)² = ∑_{i=0}^{n} (F_i)² + (F_{n+1})² = F_n F_{n+1} + (F_{n+1})²
= F_{n+1}(F_n + F_{n+1}) = F_{n+1} F_{n+2}.

(c) We use ordinary induction on n.
Base case: n = 0. Both sides of the equation are equal to 1.
Induction step: Suppose that ∑_{i=0}^{n} F_{2i+1} = F_{2n+2}. Then

∑_{i=0}^{n+1} F_{2i+1} = ∑_{i=0}^{n} F_{2i+1} + F_{2n+3} = F_{2n+2} + F_{2n+3}
= F_{2n+4} = F_{2(n+1)+2}.
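
Before proving identities like these, it can be reassuring to test them on the first few Fibonacci numbers. The sketch below assumes the usual convention F₀ = 0, F₁ = 1; the helper name and the range of n tested are arbitrary.

```python
def fib_list(count):
    """Return [F_0, F_1, ..., F_(count-1)] with F_0 = 0 and F_1 = 1."""
    fibs = [0, 1]
    while len(fibs) < count:
        fibs.append(fibs[-1] + fibs[-2])
    return fibs[:count]


F = fib_list(60)

# (a) F_0 + ... + F_n = F_(n+2) - 1
sum_identity = all(sum(F[:n + 1]) == F[n + 2] - 1 for n in range(25))
# (b) (F_0)^2 + ... + (F_n)^2 = F_n * F_(n+1)
square_identity = all(sum(x * x for x in F[:n + 1]) == F[n] * F[n + 1]
                      for n in range(25))
# (c) F_1 + F_3 + ... + F_(2n+1) = F_(2n+2)
odd_identity = all(sum(F[2 * i + 1] for i in range(n + 1)) == F[2 * n + 2]
                   for n in range(25))
```

All three checks pass for n up to 24, which is consistent with the induction proofs above but is of course no substitute for them.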


(d) 
9. 


(b) 


(c) 


II, 
15. 


18. 
21. 


(b) 


The formula is ∑_{i=0}^{n} F_{2i} = F_{2n+1} − 1.
(a) (→) Suppose a₀, a₁, a₂, ... is a Gibonacci sequence. Then in
particular a₂ = a₀ + a₁, which means c² = 1 + c. Solving this
quadratic equation by the quadratic formula leads to the
conclusion c = (1 ± √5)/2.

(←) Suppose either c = (1 + √5)/2 or c = (1 − √5)/2. Then c² =
1 + c, and therefore for every n ≥ 2, a_n = cⁿ = cⁿ⁻²c² = cⁿ⁻²(1 + c) =
cⁿ⁻² + cⁿ⁻¹ = a_{n−2} + a_{n−1}.
It will be convenient to introduce the notation c₁ = (1 + √5)/2 and
c₂ = (1 − √5)/2. Then for any n ≥ 2, a_n = sc₁ⁿ + tc₂ⁿ = sc₁ⁿ⁻²c₁² +
tc₂ⁿ⁻²c₂² = sc₁ⁿ⁻²(1 + c₁) + tc₂ⁿ⁻²(1 + c₂) = (sc₁ⁿ⁻² + tc₂ⁿ⁻²) +
(sc₁ⁿ⁻¹ + tc₂ⁿ⁻¹) = a_{n−2} + a_{n−1}.
Hint: Let s = (5a₀ + (2a₁ − a₀)√5)/10 and t = (5a₀ − (2a₁ −
a₀)√5)/10.
Hint: The formula is a_n = 2 · 3ⁿ − 3 · 2ⁿ.
Let a be the larger of 5k and k(k + 1). Now suppose n ≥ a, and by the
division algorithm choose q and r such that n = qk + r and 0 ≤ r < k.
Note that if q ≤ 4 then n = qk + r ≤ 4k + r < 5k ≤ a, which is a
contradiction. Therefore q > 4, so q ≥ 5, and by Example 6.1.3 it
follows that 2^q ≥ q². Similar reasoning shows that q ≥ k + 1, so q² ≥
q(k + 1) = qk + q ≥ qk + k > qk + r = n. Therefore 2ⁿ ≥ 2^(qk) = (2^q)^k ≥
(q²)^k ≥ n^k.
Hint: The formula is a, = Fro /Frit- 
(a) For any numbers a, b, c, and d, 


(ab)(cd) = (cd)(ab) (commutative law) 
= c(d(ab)) (associative law) 
= c((da)b) (associative law) 


= c((ad)b) (commutative law). 


To simplify notation, we will assume that any product is the left-
grouped product unless parentheses are used to indicate otherwise.
We use strong induction on n. Assume the statement is true for
products of fewer than n terms, and consider any product of a₁, a₂,
..., a_n. If n = 1, then the only product is the left-grouped product, so
there is nothing to prove. Now suppose n > 1. Then our product has
the form pq, where p is a product of a₁, ..., a_{k−1} and q is a product
of a_k, ..., a_n for some k with 2 ≤ k ≤ n. By the inductive hypothesis,
p = a₁ ··· a_{k−1} and q = a_k ··· a_n (where by our convention, these two
products are left-grouped). Thus, it will suffice to prove (a₁ ··· a_{k−1})
(a_k ··· a_n) = a₁ ··· a_n. If k = n, then the left-hand side of this
equation is already left-grouped, so there is nothing to prove. If k < n,
then

(a₁ ··· a_{k−1})(a_k ··· a_n)
= (a₁ ··· a_{k−1})((a_k ··· a_{n−1})a_n) (definition of left-grouped)
= ((a₁ ··· a_{k−1})(a_k ··· a_{n−1}))a_n (associative law)
= (a₁ ··· a_{n−1})a_n (inductive hypothesis)
= a₁ ··· a_n (definition of left-grouped).

(c) By part (b), we may assume that the two products are left-grouped.
Thus, we must prove that if $b_1, b_2, \ldots, b_n$ is some reordering of
$a_1, a_2, \ldots, a_n$ then $a_1 \cdots a_n = b_1 \cdots b_n$, where as in part (b) we assume
products are left-grouped unless parentheses indicate otherwise. We
use induction on n. If n = 1 then the products are clearly equal
because $b_1 = a_1$. Now suppose the statement is true for products of
length n, and suppose that $b_1, \ldots, b_{n+1}$ is a reordering of
$a_1, \ldots, a_{n+1}$. Then $b_{n+1}$ is one of $a_1, \ldots, a_{n+1}$. If $b_{n+1} = a_{n+1}$ then
\begin{align*}
b_1 \cdots b_{n+1} &= (b_1 \cdots b_n)a_{n+1} && \text{(definition of left-grouped)} \\
&= (a_1 \cdots a_n)a_{n+1} && \text{(inductive hypothesis)} \\
&= a_1 \cdots a_{n+1} && \text{(definition of left-grouped).}
\end{align*}
Now suppose $b_{n+1} = a_k$ for some $k \le n$. We will write $a_1 \cdots \widehat{a_k} \cdots a_n$
for the (left-grouped) product of the numbers $a_1, \ldots, a_n$ with the
factor $a_k$ left out. Then
\begin{align*}
b_1 \cdots b_{n+1} &= (b_1 \cdots b_n)a_k && \text{(definition of left-grouped)} \\
&= (a_1 \cdots \widehat{a_k} \cdots a_{n+1})a_k && \text{(inductive hypothesis)} \\
&= ((a_1 \cdots \widehat{a_k} \cdots a_n)a_{n+1})a_k && \text{(definition of left-grouped)} \\
&= (a_1 \cdots \widehat{a_k} \cdots a_n)(a_{n+1}a_k) && \text{(associative law)} \\
&= (a_1 \cdots \widehat{a_k} \cdots a_n)(a_k a_{n+1}) && \text{(commutative law)} \\
&= ((a_1 \cdots \widehat{a_k} \cdots a_n)a_k)a_{n+1} && \text{(associative law)} \\
&= (a_1 \cdots a_n)a_{n+1} && \text{(inductive hypothesis)} \\
&= a_1 \cdots a_{n+1} && \text{(definition of left-grouped).}
\end{align*}
Section 6.5 
1. B,= {n}. 
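4. $B_0 = \{\emptyset\}$, $B_1 = \{X \in \mathcal{P}(\mathbb{N}) \mid X$ has exactly one element$\}$,
$B_2 = \{X \in \mathcal{P}(\mathbb{N}) \mid X$ has either one or two elements$\}$, . . . . In general, for every
positive integer n, $B_n = \{X \in \mathcal{P}(\mathbb{N}) \mid X \ne \emptyset$ and X has at most n
elements$\}$.


(c)
10.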


$\{n \in \mathbb{Z} \mid n \ge 2\}$.
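(a) $B_0 = \{x \in \mathbb{R} \mid -2 \le x \le 0\}$, $B_1 = \{x \in \mathbb{R} \mid 0 \le x \le 4\}$,
$B_2 = \{x \in \mathbb{R} \mid 0 \le x \le 16\}$, . . . . In general, for every positive integer n,
$B_n = \{x \in \mathbb{R} \mid 0 \le x \le 2^{2^n}\}$.

$\bigcup_{n \in \mathbb{N}} B_n = \{x \in \mathbb{R} \mid x \ge -2\}$. Therefore $-1, 3 \in \bigcup_{n \in \mathbb{N}} B_n$ but
$f(-1, 3) = -3 \notin \bigcup_{n \in \mathbb{N}} B_n$, so $\bigcup_{n \in \mathbb{N}} B_n$ is not closed under f. In other
words, property 2 in Definition 5.4.8 does not hold.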
R. 


We use induction on n. 


Base case: n = 1. Then x = 2! + 2 = 4. The only value of i we have to
worry about is i = 0, and for this value of i we have i + 2 = 2 and
x + i = 4. Since 2 | 4, we have (i + 2) | (x + i), as required.

Induction step: Suppose that n is a positive integer, and for every
integer i, if $0 \le i \le n - 1$ then (i + 2) | ((n + 1)! + 2 + i). Now let
x = (n + 2)! + 2, and suppose that $0 \le i \le n$. If i = n then we have


14. 


16. 


(b) 


18. 


\[ x + i = (n+2)! + 2 + i = (i+2)! + (i+2) = (i+2)((i+1)! + 1), \]
so (i + 2) | (x + i). Now suppose $0 \le i \le n - 1$. By the inductive
hypothesis, we know that (i + 2) | ((n + 1)! + 2 + i), so we can choose
some integer k such that (n + 1)! + 2 + i = k(i + 2), and therefore
(n + 1)! = (k - 1)(i + 2). Therefore
\begin{align*}
x + i &= (n+2)! + 2 + i = (n+2)(n+1)! + (i+2) \\
&= (n+2)(k-1)(i+2) + (i+2) = (i+2)((n+2)(k-1) + 1),
\end{align*}
so (i + 2) | (x + i).
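The divisibility claim just proved, that with x = (n + 1)! + 2 each of the numbers x, x + 1, . . . , x + n - 1 is divisible by the corresponding i + 2, is easy to test for small n. This is only a sketch; the function names are illustrative and are not taken from the text.

```python
from math import factorial


def divisor_pairs(n):
    """Return the pairs (i + 2, x + i) for 0 <= i <= n - 1, where x = (n + 1)! + 2."""
    x = factorial(n + 1) + 2
    return [(i + 2, x + i) for i in range(n)]


def claim_holds(n):
    """True if (i + 2) divides (x + i) for every i with 0 <= i <= n - 1."""
    return all(value % divisor == 0 for divisor, value in divisor_pairs(n))
```

For n = 1 the only pair is (2, 4), matching the base case.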
Clearly T is a relation on A and $R = R^1 \subseteq T$. To see that T is transitive,
suppose $(x, y) \in T$ and $(y, z) \in T$. Then by the definition of T, we can
choose positive integers n and m such that $(x, y) \in R^n$ and $(y, z) \in R^m$.
Thus by exercise 11, $(x, z) \in R^m \circ R^n = R^{m+n}$, so
$(x, z) \in \bigcup_{n \in \mathbb{Z}^+} R^n = T$. Therefore T is transitive.

Finally, suppose $R \subseteq S \subseteq A \times A$ and S is transitive. We must show
that $T \subseteq S$, and clearly by the definition of T it suffices to show that
$\forall n \in \mathbb{Z}^+ (R^n \subseteq S)$. We prove this by induction on n. We have
assumed $R \subseteq S$, so when n = 1 we have $R^n = R^1 = R \subseteq S$. For the
induction step, suppose n is a positive integer and $R^n \subseteq S$. Now
suppose $(x, y) \in R^{n+1}$. Then by the definition of $R^{n+1}$ we can choose
some $z \in A$ such that $(x, z) \in R$ and $(z, y) \in R^n$. By assumption $R \subseteq S$,
and by the inductive hypothesis $R^n \subseteq S$. Therefore $(x, z) \in S$ and
$(z, y) \in S$, so since S is transitive, $(x, y) \in S$. Since (x, y) was an
arbitrary element of $R^{n+1}$, this shows that $R^{n+1} \subseteq S$.
(a) $R \cap S \subseteq R$ and $R \cap S \subseteq S$. Therefore by exercise 15, for every
positive integer n, $(R \cap S)^n \subseteq R^n$ and $(R \cap S)^n \subseteq S^n$, so
$(R \cap S)^n \subseteq R^n \cap S^n$. However, the two need not be equal. For example, if
A = {1, 2, 3, 4}, R = {(1, 2), (2, 4)}, and S = {(1, 3), (3, 4)}, then
$(R \cap S)^2 = \emptyset$ but $R^2 \cap S^2 = \{(1, 4)\}$.
$R^n \cup S^n \subseteq (R \cup S)^n$, but they need not be equal. (You should be able
to prove the first statement, and find a counterexample to justify the
second.)
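The counterexample above can be reproduced by computing powers of a relation directly. The sketch below represents a relation as a Python set of pairs and assumes the recursive definition $R^{k+1} = R^k \circ R$; the names `compose` and `power` are illustrative only.

```python
def compose(s, r):
    """Return S ∘ R: all (x, z) with (x, y) in R and (y, z) in S for some y."""
    return {(x, z) for (x, y) in r for (y2, z) in s if y == y2}


def power(r, n):
    """Return R^n for a positive integer n, using R^(k+1) = R^k ∘ R."""
    result = set(r)
    for _ in range(n - 1):
        result = compose(result, r)
    return result


# The relations from the counterexample on A = {1, 2, 3, 4}.
R = {(1, 2), (2, 4)}
S = {(1, 3), (3, 4)}
```

With these definitions, `power(R & S, 2)` is empty while `power(R, 2) & power(S, 2)` contains the pair (1, 4).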


(a) We use induction. 


Base case: n = 1. Suppose $(a, b) \in R^1 = R$. Let f = {(0, a), (1, b)}.
Then f is an R-path from a to b of length 1. For the other direction,
suppose f is an R-path from a to b of length 1. By the definition of
R-path, this means that f(0) = a, f(1) = b, and $(f(0), f(1)) \in R$.
Therefore $(a, b) \in R = R^1$.

Induction step: Suppose n is a positive integer and $R^n = \{(a, b) \in
A \times A \mid$ there is an R-path from a to b of length n$\}$. Now suppose
$(a, b) \in R^{n+1} = R^1 \circ R^n$ by exercise 11. Then there is some c such that
$(a, c) \in R^n$ and $(c, b) \in R$. By the inductive hypothesis, there is an
R-path f from a to c of length n. Then $f \cup \{(n + 1, b)\}$ is an R-path
from a to b of length n + 1. For the other direction, suppose f is an
R-path from a to b of length n + 1. Let c = f(n). Then $f \setminus \{(n + 1, b)\}$ is an
R-path from a to c of length n, so by the inductive hypothesis
$(a, c) \in R^n$. But also $(c, b) = (f(n), f(n + 1)) \in R$, so
$(a, b) \in R^1 \circ R^n = R^{n+1}$.

(b) This follows from part (a) and exercise 14. 


Chapter 7 


Section 7.1 

2. (a) gcd(775, 682) = 31 = −7 · 775 + 8 · 682.

(b) gcd(562, 243) = 1 = 16 · 562 − 37 · 243.
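The coefficients in these answers can be recovered with the extended Euclidean algorithm. The following is a minimal sketch of one standard iterative formulation; it is not necessarily the bookkeeping used in the text, and the name `extended_gcd` is illustrative.

```python
def extended_gcd(a, b):
    """Return (g, s, t) with g = gcd(a, b) and g = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder stays a linear combination of a and b.
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t
```

Calling `extended_gcd(775, 682)` gives the gcd 31 together with the coefficients −7 and 8, and `extended_gcd(562, 243)` gives 1 with coefficients 16 and −37.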

5. Let n be an arbitrary integer.

(→) Suppose n is a linear combination of a and b. Then there are
integers s and t such that n = sa + tb. Since d = gcd(a, b), d | a and d | b,
so there are integers j and k such that a = jd and b = kd. Therefore
n = sa + tb = sjd + tkd = (sj + tk)d, so d | n.

(←) Suppose d | n. Then there is some integer k such that n = kd.
By Theorem 7.1.4, there are integers s and t such that d = sa + tb.
Therefore n = kd = k(sa + tb) = ksa + ktb, so n is a linear combination
of a and b.

7. (a) No. Counterexample: a = b = 2, a’ = 3, b' = 4. 

(b) Yes. Suppose a | a’ and b | b’. Let d = gcd(a, b). Then d | a and d | b. 
Since d | a and a | a’, by Theorem 3.3.7, d | a’. Similarly, d | b'. 
Therefore, by Theorem 7.1.6, d | gcd(a’, b’). 

9. We use strong induction on the maximum of a and b. In other words,
we prove the following statement by strong induction:
\[ \forall k \in \mathbb{Z}^+ [\forall a \in \mathbb{Z}^+ \forall b \in \mathbb{Z}^+ (\max(a, b) = k \to \gcd(2^a - 1, 2^b - 1) = 2^{\gcd(a,b)} - 1)], \]
where max(a, b) denotes the maximum of a and b.

Let $k \in \mathbb{Z}^+$ be arbitrary and assume that for every positive integer
k' < k,
\[ \forall a \in \mathbb{Z}^+ \forall b \in \mathbb{Z}^+ (\max(a, b) = k' \to \gcd(2^a - 1, 2^b - 1) = 2^{\gcd(a,b)} - 1). \]
Now let a and b be arbitrary positive integers and assume that
max(a, b) = k. We may assume that $a \ge b$, since otherwise we can swap the
values of a and b. We consider two cases.


12. 


Case 1. a = b. Then
\[ \gcd(2^a - 1, 2^b - 1) = \gcd(2^a - 1, 2^a - 1) = 2^a - 1 = 2^{\gcd(a,a)} - 1 = 2^{\gcd(a,b)} - 1. \]

Case 2. a > b. Let c = a - b > 0, so that a = c + b. Let k' = max(c, b).
Since b < a and c < a, k' < a = max(a, b) = k. Therefore
\begin{align*}
\gcd(2^a - 1, 2^b - 1) &= \gcd(2^c - 1 + 2^a - 2^c, 2^b - 1) \\
&= \gcd(2^c - 1 + 2^c(2^b - 1), 2^b - 1) \\
&= \gcd(2^c - 1, 2^b - 1) && \text{(exercise 6)} \\
&= 2^{\gcd(c,b)} - 1 && \text{(inductive hypothesis)} \\
&= 2^{\gcd(c+b,b)} - 1 && \text{(exercise 6)} \\
&= 2^{\gcd(a,b)} - 1.
\end{align*}
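The identity $\gcd(2^a - 1, 2^b - 1) = 2^{\gcd(a,b)} - 1$ established above can be confirmed on small inputs with Python's built-in `math.gcd`. This is a sanity check only; the function name is illustrative.

```python
from math import gcd


def mersenne_gcd_identity_holds(a, b):
    """True if gcd(2^a - 1, 2^b - 1) equals 2^gcd(a, b) - 1 for positive a, b."""
    return gcd(2 ** a - 1, 2 ** b - 1) == 2 ** gcd(a, b) - 1
```

For example, with a = 6 and b = 4 both sides equal 3, since gcd(63, 15) = 3 and gcd(6, 4) = 2.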


(a) gcd(55, 34) = 1. The numbers $r_i$ are the Fibonacci numbers.
There are 8 divisions.


(b) $\gcd(F_{n+1}, F_n) = 1$. There are n − 1 division steps.
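The division counts in this answer can be observed by instrumenting the Euclidean algorithm. The sketch below assumes that the final division, the one that leaves remainder 0, is included in the count (this is the convention under which gcd(55, 34) takes 8 divisions) and that $F_0 = 0$, $F_1 = 1$; both helper names are illustrative.

```python
def fib(n):
    """Return F_n, assuming F_0 = 0 and F_1 = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def euclid_divisions(a, b):
    """Return (gcd, number of divisions) for the Euclidean algorithm on a >= b > 0."""
    count = 0
    while b != 0:
        a, b = b, a % b  # one division with remainder
        count += 1
    return a, count
```

Under these assumptions, consecutive Fibonacci numbers $F_{n+1}$ and $F_n$ with $n \ge 2$ need n − 1 divisions.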


Section 7.2 
2. 14950. 
5. 


(b) 


Suppose some prime number p appears in the prime factorizations of
both a and b. Then p | a and p | b, so $\gcd(a, b) \ge p > 1$, and therefore a
and b are not relatively prime.

Now suppose a and b are not relatively prime. Let d = gcd(a, b) > 1.
Let p be any prime number in the prime factorization of d. Then
since d | a and d | b, p must occur in the prime factorizations of both a
and b.


Let d = gcd(a, b) and x = ab/ gcd(a, b) = ab/d. 

Since d = gcd(a, b), d | b, so there is some integer k such that b = kd. 
Therefore x = akd/d = ak, so x is an integer and a | x. A similar 
argument shows that b | x, so x is a common multiple of a and b. 
Since m is the least common multiple, $m \le x$.

Suppose r > 0. Since a | m, there is some integer t such that m = ta. 
Therefore r = ab — qm = ab - qta = (b - qt)a, so a | r. Similarly, b | r. 


(c) 
(d) 
if. 


13. 


16. 


19. 


But r < m, so this contradicts the definition of m as the least positive 
integer that is divisible by both a and b. Therefore r = 0. 

With t defined as in part (b), ab = qm = qta. Dividing both sides by a, 
we get b = qt, so q | b. The proof that q | a is similar. 

Since q | a and q | b, $q \le \gcd(a, b)$. Therefore $ab = qm \le \gcd(a, b)m$,
so $m \ge ab/\gcd(a, b)$.

Hint: One approach is to let q and r be the quotient and remainder 
when m is divided by lcm(a, b), and prove that r = 0. 

Let the prime factorization of b be $b = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$. Then the
factorization of $b^2$ is $b^2 = p_1^{2e_1} p_2^{2e_2} \cdots p_k^{2e_k}$. Since $a^2 \mid b^2$, every prime
factor of a must be one of $p_1, p_2, \ldots, p_k$, so $a = p_1^{f_1} p_2^{f_2} \cdots p_k^{f_k}$ for
some natural numbers $f_1, f_2, \ldots, f_k$. Therefore $a^2 = p_1^{2f_1} p_2^{2f_2} \cdots p_k^{2f_k}$.
Since $a^2 \mid b^2$, for every i we must have $2f_i \le 2e_i$, and therefore $f_i \le e_i$.
Thus a | b.

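Let $p_1, p_2, \ldots, p_k$ be a list of all primes that occur in the prime
factorization of either a or b, so that
\[ a = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}, \qquad b = p_1^{f_1} p_2^{f_2} \cdots p_k^{f_k} \]
for some natural numbers $e_1, e_2, \ldots, e_k$ and $f_1, f_2, \ldots, f_k$. For i = 1, 2,
. . . , k, let
\[ g_i = \begin{cases} e_i, & \text{if } e_i \ge f_i, \\ 0, & \text{if } e_i < f_i, \end{cases} \qquad h_i = \begin{cases} 0, & \text{if } e_i \ge f_i, \\ f_i, & \text{if } e_i < f_i. \end{cases} \]
Let
\[ c = p_1^{g_1} p_2^{g_2} \cdots p_k^{g_k}, \qquad d = p_1^{h_1} p_2^{h_2} \cdots p_k^{h_k}. \]
Then for all i, $g_i \le e_i$ and $h_i \le f_i$, and therefore c | a and d | b. Also, c
and d have no prime factors in common, so by exercise 5, c and d are
relatively prime. Finally,
\[ cd = p_1^{g_1+h_1} p_2^{g_2+h_2} \cdots p_k^{g_k+h_k} = p_1^{\max(e_1,f_1)} p_2^{\max(e_2,f_2)} \cdots p_k^{\max(e_k,f_k)} = \operatorname{lcm}(a, b). \]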


(a) Since x is a positive rational number, there are positive integers
m and n such that x = m/n. Let d = gcd(m, n). By exercise 9, we
can let a and b be positive integers such that m = da, n = db, and
gcd(a, b) = 1. Then
\[ x = \frac{m}{n} = \frac{da}{db} = \frac{a}{b}. \]

(b) Since a/b = c/d, ad = bc. Therefore a | bc. Since gcd(a, b) = 1, by 
Theorem 7.2.2, a | c. A similar argument shows c | a, so a = c. 
Therefore ad = bc = ba, and dividing both sides by a we conclude 
that b = d. 


(c) By part (a), we have x = a/b, where a and b are relatively prime
positive integers. Let the prime factorizations of a and b be
\[ a = r_1^{g_1} r_2^{g_2} \cdots r_j^{g_j}, \qquad b = s_1^{h_1} s_2^{h_2} \cdots s_l^{h_l}. \]
Note that by exercise 5, these factorizations have no primes in
common. Then
\[ x = \frac{a}{b} = \frac{r_1^{g_1} r_2^{g_2} \cdots r_j^{g_j}}{s_1^{h_1} s_2^{h_2} \cdots s_l^{h_l}} = r_1^{g_1} r_2^{g_2} \cdots r_j^{g_j} s_1^{-h_1} s_2^{-h_2} \cdots s_l^{-h_l}. \]
Rearranging the primes $r_1, \ldots, r_j, s_1, \ldots, s_l$ into increasing order
gives the required product $p_1^{e_1} \cdots p_k^{e_k}$.


(d) We begin by reversing the steps of part (c). Let $r_1, r_2, \ldots, r_j$ be those
primes in the product $p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k}$ whose exponents are positive,
listed in increasing order, and $s_1, s_2, \ldots, s_l$ those whose exponents
are negative. Rewriting each prime raised to a negative power as the
prime to a positive power in the denominator, we get
\[ x = p_1^{e_1} p_2^{e_2} \cdots p_k^{e_k} = \frac{r_1^{g_1} r_2^{g_2} \cdots r_j^{g_j}}{s_1^{h_1} s_2^{h_2} \cdots s_l^{h_l}}, \]
where all the exponents $g_i$ and $h_i$ are positive integers. The numerator
and denominator have no prime factors in common, so they are
relatively prime. Similarly, the product $q_1^{f_1} q_2^{f_2} \cdots q_m^{f_m}$ can be rewritten
as a fraction with all exponents positive:
\[ x = q_1^{f_1} q_2^{f_2} \cdots q_m^{f_m} = \frac{v_1^{y_1} v_2^{y_2} \cdots v_t^{y_t}}{w_1^{z_1} w_2^{z_2} \cdots w_u^{z_u}}. \]
By part (b), $r_1^{g_1} r_2^{g_2} \cdots r_j^{g_j} = v_1^{y_1} v_2^{y_2} \cdots v_t^{y_t}$ and $s_1^{h_1} s_2^{h_2} \cdots s_l^{h_l} = w_1^{z_1} w_2^{z_2} \cdots w_u^{z_u}$. By the
uniqueness of prime factorizations, j = t and for all $i \in \{1, \ldots, j\}$, $r_i = v_i$
and $g_i = y_i$, and also l = u and for all $i \in \{1, \ldots, l\}$, $s_i = w_i$ and
$h_i = z_i$. Rewriting the primes in the denominator as primes raised to
negative powers, we find that the original two products
$p_1^{e_1} \cdots p_k^{e_k}$ and $q_1^{f_1} \cdots q_m^{f_m}$ are the same.


Section 7.3 


4. (a) Since $Z_1$ is an additive identity element, $Z_1 + Z_2 = Z_2$. And since


(b) 


(c) 


(b) 


$Z_2$ is an additive identity element, $Z_1 + Z_2 = Z_1$. Therefore
$Z_1 = Z_1 + Z_2 = Z_2$.
Since $X'_1$ is an additive inverse for X, $X'_1 + X + X'_2 = [0]_m + X'_2 = X'_2$.
Similarly, since $X'_2$ is an additive inverse for X, $X'_1 + X + X'_2 =
X'_1 + [0]_m = X'_1$. Therefore $X'_1 = X'_2$.
Suppose $O_1$ and $O_2$ are multiplicative identity elements. Then
$O_1 = O_1 \cdot O_2 = O_2$.
Suppose $X'_1$ and $X'_2$ are multiplicative inverses of X. Then
$X'_1 = X'_1 \cdot [1]_m = X'_1 \cdot X \cdot X'_2 = [1]_m \cdot X'_2 = X'_2$.
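Let a and b be arbitrary integers. Then
\begin{align*}
na \equiv nb \pmod{nm} &\text{ iff } \exists k \in \mathbb{Z}(nb - na = knm) \\
&\text{ iff } \exists k \in \mathbb{Z}(b - a = km) \text{ iff } a \equiv b \pmod{m}.
\end{align*}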


(a) xE [95]o37. 
xE [12]-9. 
Let a and b be arbitrary integers. Suppose first that $a \equiv b \pmod{m}$.
Then $[a]_m = [b]_m$, so $[na]_m = [n]_m \cdot [a]_m = [n]_m \cdot [b]_m = [nb]_m$ and
therefore $na \equiv nb \pmod{m}$.
Now suppose that $na \equiv nb \pmod{m}$, so $[n]_m \cdot [a]_m = [na]_m = [nb]_m
= [n]_m \cdot [b]_m$. Since m and n are relatively prime, $[n]_m$ has a
multiplicative inverse. Multiplying both sides of the equation
$[n]_m \cdot [a]_m = [n]_m \cdot [b]_m$ by $[n]_m^{-1}$, we get $[a]_m = [b]_m$, so $a \equiv b \pmod{m}$.


15. Hint: Prove that if $a \equiv b \pmod{m}$ then $D(m) \cap D(a) = D(m) \cap D(b)$.
17. (a) First note that $10 \equiv 1 \pmod{3}$, so $[10]_3 = [1]_3$. Therefore
$[10^2]_3 = [10]_3 \cdot [10]_3 = [1]_3 \cdot [1]_3 = [1]_3$, $[10^3]_3 = [10^2]_3 \cdot [10]_3 = [1]_3 \cdot [1]_3
= [1]_3$, and, in general, for every $i \in \mathbb{N}$, $[10^i]_3 = [1]_3$. (A more
careful proof could be done by induction.) Thus
\begin{align*}
[n]_3 &= [d_0 + 10d_1 + \cdots + 10^k d_k]_3 \\
&= [d_0]_3 + [10]_3 \cdot [d_1]_3 + \cdots + [10^k]_3 \cdot [d_k]_3 \\
&= [d_0]_3 + [1]_3 \cdot [d_1]_3 + \cdots + [1]_3 \cdot [d_k]_3 \\
&= [d_0 + d_1 + \cdots + d_k]_3.
\end{align*}
In other words, $n \equiv (d_0 + d_1 + \cdots + d_k) \pmod{3}$.
(b) 3 | n iff $[n]_3 = [0]_3$ iff $[d_0 + \cdots + d_k]_3 = [0]_3$ iff $3 \mid (d_0 + \cdots + d_k)$.
19. (a) Suppose $n \ge 10$. First note that
\[ 10f(n) = (d_k \cdots d_1 0)_{10} + 50d_0 = (d_k \cdots d_1 d_0)_{10} + 49d_0 = n + 49d_0. \]
Therefore $3f(n) - n = 49d_0 - 7f(n) = 7(7d_0 - f(n))$, so $n \equiv 3f(n) \pmod{7}$,
or equivalently $[n]_7 = [3]_7 \cdot [f(n)]_7$. Since $[3]_7^{-1} = [5]_7$, it follows that
$[f(n)]_7 = [5]_7 \cdot [n]_7$, so $f(n) \equiv 5n \pmod{7}$.

(b) Suppose $n \ge 10$. If 7 | n then $[n]_7 = [0]_7$, so $[f(n)]_7 = [5n]_7 = [5]_7 \cdot [0]_7
= [0]_7$, and therefore 7 | f(n). Similarly, if 7 | f(n) then $[f(n)]_7 = [0]_7$,
so $[n]_7 = [3f(n)]_7 = [3]_7 \cdot [0]_7 = [0]_7$ and 7 | n.

(c) f(627334) = 62733 + 5 · 4 = 62753; f(62753) = 6275 + 5 · 3 = 6290;
f(6290) = 629 + 5 · 0 = 629; f(629) = 62 + 5 · 9 = 107; f(107) = 10 +
5 · 7 = 45; f(45) = 4 + 5 · 5 = 29. Since $7 \nmid 29$, $7 \nmid 627334$.
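The reduction in part (c) is mechanical enough to automate. The sketch below assumes, as the computation above indicates, that f(n) drops the last digit of n and adds five times that digit to what remains. The stopping rule (stop as soon as applying f no longer decreases the value) is an assumption made here so that the loop terminates; it is not stated in the text.

```python
def f(n):
    """Drop the last decimal digit of n and add 5 times that digit."""
    return n // 10 + 5 * (n % 10)


def reduction_chain(n):
    """Apply f repeatedly while n >= 10 and the value strictly decreases."""
    values = [n]
    while n >= 10 and f(n) < n:
        n = f(n)
        values.append(n)
    return values


def divisible_by_7(n):
    """Test divisibility by 7 using the last value in the reduction chain."""
    return reduction_chain(n)[-1] % 7 == 0
```

For 627334 the chain is 627334, 62753, 6290, 629, 107, 45, 29, the same sequence as in part (c).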


Section 7.4 


2. (a) $\varphi(539) = 420$.


(b) $\varphi(540) = 144$.
(c) $\varphi(541) = 540$.
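These three values can be confirmed by counting directly. The sketch below computes Euler's function by brute force from its usual definition (the number of integers k with $1 \le k \le n$ and gcd(k, n) = 1); it is far slower than using the prime factorization, but it makes no use of any formula from the chapter. The name `phi` is illustrative.

```python
from math import gcd


def phi(n):
    """Count the integers k with 1 <= k <= n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
```

The factorizations 539 = 7² · 11 and 540 = 2² · 3³ · 5, together with the fact that 541 is prime, give the same three values.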


10. 


13. 
15. 


Suppose $a \equiv b \pmod{mn}$. Then mn | (b − a), so for some integer k,
b − a = kmn. Therefore m | (b − a) and n | (b − a), so $a \equiv b \pmod{m}$ and
$a \equiv b \pmod{n}$.

Now suppose $a \equiv b \pmod{m}$ and $a \equiv b \pmod{n}$. Since $a \equiv b \pmod{n}$,
n | (b − a), so there is some integer j such that b − a = jn. Since
$a \equiv b \pmod{m}$, m | (b − a), so m | jn. But gcd(m, n) = 1, so by Theorem
7.2.2 it follows that m | j. Let k be an integer such that j = km. Then
b − a = jn = kmn. Therefore mn | (b − a), so $a \equiv b \pmod{mn}$.


The first half of the solution to exercise 6 does not use the hypothesis 
that m and n are relatively prime, so the left-to-right direction of the 
“iff” statement is correct even if this hypothesis is dropped. Here is a 
counterexample for the other direction: a = 0, b = 12, m = 4, n = 6. 

Suppose p is prime and a is a positive integer. We consider two cases. 

Case 1. $p \nmid a$. Then p and a are relatively prime, so by Theorem
7.4.2, $[a]_p^{p-1} = [1]_p$. Therefore
$[a^p]_p = [a]_p^{p-1} \cdot [a]_p = [1]_p \cdot [a]_p = [a]_p$, so $a^p \equiv a \pmod{p}$.

Case 2. p | a. Then $[a]_p = [0]_p$, so $[a^p]_p = [0]_p^p = [0]_p = [a]_p$, and
therefore $a^p \equiv a \pmod{p}$.

Hint: Use Lemma 7.4.6 and induction on k. 
(a) We proceed by induction on k.

Base case: When k = 1, the statement to be proven is that for every
positive integer $m_1$ and every integer $a_1$, there is an integer r such
that $1 \le r \le m_1$ and $r \equiv a_1 \pmod{m_1}$. This is true because
$\{1, 2, \ldots, m_1\}$ is a complete residue system modulo $m_1$.

Induction step: Suppose that the statement holds for lists of k
pairwise relatively prime positive integers, and let $m_1, m_2, \ldots, m_{k+1}$
be a list of k + 1 pairwise relatively prime positive integers. Let
$M' = m_1 m_2 \cdots m_k$ and $M = m_1 m_2 \cdots m_{k+1} = M' m_{k+1}$. Let $a_1, a_2, \ldots, a_{k+1}$
be arbitrary integers. By the inductive hypothesis, there is an
integer r' such that for all $i \in \{1, 2, \ldots, k\}$, $r' \equiv a_i \pmod{m_i}$. By
exercise 13, $\gcd(M', m_{k+1}) = 1$, so by Lemma 7.4.7 there is some
integer r such that $1 \le r \le M$, $r \equiv r' \pmod{M'}$, and $r \equiv a_{k+1} \pmod{m_{k+1}}$.
By exercise 14, for every $i \in \{1, 2, \ldots, k\}$, $r \equiv r' \pmod{m_i}$,
and therefore $r \equiv a_i \pmod{m_i}$.


(b) Suppose that $1 \le r_1, r_2 \le M$ and for all $i \in \{1, 2, \ldots, k\}$, $r_1 \equiv a_i \pmod{m_i}$
and $r_2 \equiv a_i \pmod{m_i}$. Then for all $i \in \{1, 2, \ldots, k\}$, $r_1 \equiv r_2 \pmod{m_i}$,
so by exercise 14, $r_1 \equiv r_2 \pmod{M}$. Therefore $r_1 = r_2$.

17. Suppose m and n are relatively prime. Let the elements of D(m) be
$a_1, a_2, \ldots, a_s$ and let the elements of D(n) be $b_1, b_2, \ldots, b_t$. Then
$\sigma(m) = a_1 + a_2 + \cdots + a_s$ and $\sigma(n) = b_1 + b_2 + \cdots + b_t$. Using the function f
from part (b) of exercise 16, we see that the elements of D(mn) are all
products of the form $a_i b_j$, where $1 \le i \le s$ and $1 \le j \le t$. Thus we can
arrange the elements of D(mn) in a table with s rows and t columns,
where the entry in row i, column j of the table is $a_i b_j$; every element
of D(mn) appears exactly once in this table. To compute $\sigma(mn)$, we
must add up all entries in this table. We will do this by first adding up
each row of the table, and then adding these row sums.

For $1 \le i \le s$, let $r_i$ be the sum of row i of the table. Then
\[ r_i = a_i b_1 + a_i b_2 + \cdots + a_i b_t = a_i(b_1 + b_2 + \cdots + b_t) = a_i \sigma(n). \]
Therefore
\begin{align*}
\sigma(mn) &= r_1 + r_2 + \cdots + r_s = a_1\sigma(n) + a_2\sigma(n) + \cdots + a_s\sigma(n) \\
&= (a_1 + a_2 + \cdots + a_s)\sigma(n) = \sigma(m)\sigma(n).
\end{align*}
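The conclusion $\sigma(mn) = \sigma(m)\sigma(n)$ for relatively prime m and n can be checked on small values. The following sketch computes the divisor sum by brute force; the function names are illustrative, and the non-coprime pair in the usage note shows that the hypothesis cannot simply be dropped.

```python
from math import gcd


def sigma(n):
    """Return the sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


def multiplicative_on_coprime_pairs(limit):
    """Check sigma(m*n) == sigma(m)*sigma(n) for all coprime m, n up to limit."""
    return all(
        sigma(m * n) == sigma(m) * sigma(n)
        for m in range(1, limit + 1)
        for n in range(1, limit + 1)
        if gcd(m, n) == 1
    )
```

For m = n = 2, which are not relatively prime, sigma(4) is 7 while sigma(2) · sigma(2) is 9.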
Section 7.5 
2. (a) n = 5893, $\varphi(n) = 5740$, d = 2109.
(b) c= 3421. 
5. (a) n=17- 29. 
(b) d=257. 
(c) m= 183. 
7. (a) c=72. 
(b) d=63. 
(c) 288. 


(d) o(n) = 144, d= 47, 18. 


J. 


12. 


(a) 


(b) 


(c) 


We use strong induction. Suppose that a is a positive integer, and for
every positive integer k < a, the computation of $X^k$ uses at most
$2\log_2 k$ multiplications.

Case 1. a = 1. Then $X^a = X^1 = X$, so no multiplications are needed,
and $2\log_2 a = 2\log_2 1 = 0$.

Case 2. a is even. Then a = 2k for some positive integer k < a, and
to compute $X^a$ we use the formula $X^a = X^k \cdot X^k$. Let m be the number
of multiplications used to compute $X^k$. By the inductive hypothesis,
$m \le 2\log_2 k$. To compute $X^a$ we use one additional multiplication (to
multiply $X^k$ by itself), so the number of multiplications is
\[ m + 1 \le 2\log_2 k + 1 < 2(\log_2 k + 1) = 2\log_2(2k) = 2\log_2 a. \]

Case 3. a > 1 and a is odd. Then a = 2k + 1 for some positive
integer k < a, and to compute $X^a$ we use the formula $X^a = X^k \cdot X^k \cdot X$.
As in case 2, if we let m be the number of multiplications used to
compute $X^k$ then we have $m \le 2\log_2 k$. To compute $X^a$ we use two
additional multiplications, so the number of multiplications is
\[ m + 2 \le 2\log_2 k + 2 = 2\log_2(2k) < 2\log_2(2k + 1) = 2\log_2 a. \]
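The multiplication count analyzed in this proof can be observed by instrumenting a recursive implementation of the same three cases. The sketch below assumes that X is a residue class, represented here by an integer reduced modulo a fixed `modulus`; the function name and the counting convention (each product counts once) are illustrative choices, not taken from the text.

```python
import math


def power_with_count(x, a, modulus):
    """Return (x^a mod modulus, multiplications used) for a positive integer a."""
    if a == 1:
        return x % modulus, 0          # Case 1: no multiplications
    half, count = power_with_count(x, a // 2, modulus)
    result = (half * half) % modulus   # X^k * X^k: one multiplication
    count += 1
    if a % 2 == 1:
        result = (result * x) % modulus  # odd case: one more, by X
        count += 1
    return result, count
```

Comparing `count` with `2 * math.log2(a)` for a range of exponents shows the bound $2\log_2 a$ holding in each case.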


Since $a \in R_2$, $[a]_n^{n-1} \ne [1]_n$. And since gcd(n, a) = 1, $[a]_n$ has a
multiplicative inverse.
Suppose $x \in R_1$. Then $2 \le x \le n - 1$ and $[x]_n^{n-1} = [1]_n$. Since
{0, 1, . . . , n − 1} is a complete residue system modulo n, there is a unique y
such that $0 \le y \le n - 1$ and $ax \equiv y \pmod{n}$, so $[a]_n \cdot [x]_n = [y]_n$. We
must prove that $y \in R_2$. If y = 0 then $[x]_n =
[a]_n^{-1} \cdot [y]_n = [a]_n^{-1} \cdot [0]_n = [0]_n$, which contradicts the fact that $2 \le x \le
n - 1$. Therefore $1 \le y \le n - 1$. And $[y]_n^{n-1} = [a]_n^{n-1} \cdot [x]_n^{n-1} =
[a]_n^{n-1} \cdot [1]_n = [a]_n^{n-1} \ne [1]_n$. Therefore $y^{n-1} \not\equiv 1 \pmod{n}$. It follows
that $y \ne 1$, so $2 \le y \le n - 1$.

Suppose $f(x_1) = f(x_2) = y$. Then $[a]_n \cdot [x_1]_n = [y]_n = [a]_n \cdot [x_2]_n$, so
$[x_1]_n = [a]_n^{-1} \cdot [y]_n = [x_2]_n$, and therefore $x_1 = x_2$.

By part (b), $R_1$ has the same number of elements as Ran(f). Since
$\operatorname{Ran}(f) \subseteq R_2$, $R_2$ has at least as many elements as $R_1$. So at least half
the elements of R are in $R_2$.


Chapter 8 


Section 8.1 


1. (a) Define f: Z* > N by the formula f(n) = n — 1. It is easy to check 


(b) 


(b) 


that f is one-to-one and onto. 
Let $E = \{n \in \mathbb{Z} \mid n$ is even$\}$, and define $f: \mathbb{Z} \to E$ by the formula f(n)
= 2n. It is easy to check that f is one-to-one and onto, so $\mathbb{Z} \sim E$. But
we already know that $\mathbb{Z}^+ \sim \mathbb{Z}$, so by Theorem 8.1.3, $\mathbb{Z}^+ \sim E$, and
therefore E is denumerable.
(a) No. Counterexample: Let A= B = C = Z* and D = {1}. 
No. Counterexample: Let A= B = N, C= Z, and D = Ø. 
(a) We prove that $\forall n \in \mathbb{N} \forall m \in \mathbb{N}(I_n \sim I_m \to n = m)$ by induction
on n.

Base case: n = 0. Suppose that $m \in \mathbb{N}$ and there is a one-to-one,
onto function $f: I_n \to I_m$. Since n = 0, $I_n = \emptyset$. But then since f is onto,
we must also have $I_m = \emptyset$, so m = 0 = n.

Induction step: Suppose that $n \in \mathbb{N}$, and for all $m \in \mathbb{N}$, if $I_n \sim I_m$
then n = m. Now suppose that $m \in \mathbb{N}$ and $I_{n+1} \sim I_m$. Let $f: I_{n+1} \to I_m$
be a one-to-one, onto function. Let k = f(n + 1), and notice that
$1 \le k \le m$, so m is positive. Using the fact that f is onto, choose some
$j \le n + 1$ such that f(j) = m.

We now define $g: I_n \to I_{m-1}$ as follows:
\[ g(i) = \begin{cases} f(i), & \text{if } i \ne j, \\ k, & \text{if } i = j. \end{cases} \]
We leave it to the reader to verify that g is one-to-one and onto. By
the inductive hypothesis, it follows that n = m − 1, so n + 1 = m.


(b) Suppose A is finite. Then by the definition of "finite," we know that
there is at least one $n \in \mathbb{N}$ such that $I_n \sim A$. To see that it is unique,
suppose that n and m are natural numbers, $I_n \sim A$, and $I_m \sim A$. Then
by Theorem 8.1.3, $I_n \sim I_m$, so by part (a), n = m.

8. (a) We use induction on n.
Base case: n = 0. Suppose $A \subseteq I_n = \emptyset$. Then $A = \emptyset$, so |A| = 0.
Induction step: Suppose that $n \in \mathbb{N}$, and for all $A \subseteq I_n$, A is finite,
$|A| \le n$, and if $A \ne I_n$ then |A| < n. Now suppose that $A \subseteq I_{n+1}$. If
$A = I_{n+1}$ then clearly $A \sim I_{n+1}$, so A is finite and |A| = n + 1. Now suppose
that $A \ne I_{n+1}$. If $n + 1 \notin A$, then $A \subseteq I_n$, so by the inductive
hypothesis, A is finite and $|A| \le n$. If $n + 1 \in A$, then there must be
some $k \in I_n$ such that $k \notin A$. Let $A' = (A \cup \{k\}) \setminus \{n + 1\}$. Then by
matching up k with n + 1 it is not hard to show that $A' \sim A$. Also,
$A' \subseteq I_n$, so by the inductive hypothesis, A' is finite and $|A'| \le n$.
Therefore by exercise 7, A is finite and $|A| \le n$.

(b) Suppose A is finite and $B \subseteq A$. Let n = |A|, and let $f: A \to I_n$ be
one-to-one and onto. Then $f(B) \subseteq I_n$, so by part (a), f(B) is finite,
$|f(B)| \le n$, and if $B \ne A$ then $f(B) \ne I_n$, so |f(B)| < n. Since $B \sim f(B)$, the desired
conclusion follows.

10. Hint: Define $g: B \to I_n$ by the formula
g(x) = the smallest $i \in I_n$ such that f(i) = x,
and show that g is one-to-one.

12. Notice first that either i + j − 2 or i + j − 1 is even, so f(i, j) is a
positive integer, and therefore f is a function from $\mathbb{Z}^+ \times \mathbb{Z}^+$ to $\mathbb{Z}^+$, as
claimed. It will be helpful to verify two facts about the function f.
Both of the facts below can be checked by straightforward algebra:

(a) For all $j \in \mathbb{Z}^+$, f(1, j + 1) − f(1, j) = j.
(b) For all $i \in \mathbb{Z}^+$ and $j \in \mathbb{Z}^+$, $f(1, i + j - 1) \le f(i, j) < f(1, i + j)$. It follows
that i + j is the smallest $k \in \mathbb{Z}^+$ such that f(i, j) < f(1, k).

To see that f is one-to-one, suppose that $f(i_1, j_1) = f(i_2, j_2)$. Then by
fact (b) above,
\begin{align*}
i_1 + j_1 &= \text{the smallest } k \in \mathbb{Z}^+ \text{ such that } f(i_1, j_1) < f(1, k) \\
&= \text{the smallest } k \in \mathbb{Z}^+ \text{ such that } f(i_2, j_2) < f(1, k) \\
&= i_2 + j_2.
\end{align*}
Using the definition of f, it follows that
\[ i_1 = f(i_1, j_1) - \frac{(i_1 + j_1 - 2)(i_1 + j_1 - 1)}{2} = f(i_2, j_2) - \frac{(i_2 + j_2 - 2)(i_2 + j_2 - 1)}{2} = i_2. \]
But then since $i_1 = i_2$ and $i_1 + j_1 = i_2 + j_2$, we must also have $j_1 = j_2$, so
$(i_1, j_1) = (i_2, j_2)$. This shows that f is one-to-one.

To see that f is onto, suppose $n \in \mathbb{Z}^+$. It is easy to verify that
f(1, n + 1) > n, so we can let k be the smallest positive integer such that
f(1, k) > n. Notice that $f(1, 1) = 1 \le n$, so $k \ge 2$. Since k is smallest,
$f(1, k - 1) \le n$, and therefore by fact (a),
\[ 0 \le n - f(1, k - 1) < f(1, k) - f(1, k - 1) = k - 1. \]
Adding 1 to all terms, we get
\[ 1 \le n - f(1, k - 1) + 1 < k. \]
Thus, if we let i = n − f(1, k − 1) + 1 then $1 \le i < k$. Let j = k − i, and
notice that $i \in \mathbb{Z}^+$ and $j \in \mathbb{Z}^+$. With this choice for i and j we have
\begin{align*}
f(i, j) &= \frac{(i + j - 2)(i + j - 1)}{2} + i \\
&= \frac{(k - 2)(k - 1)}{2} + n - f(1, k - 1) + 1 \\
&= \frac{(k - 2)(k - 1)}{2} + n - \frac{(k - 2)(k - 1)}{2} - 1 + 1 = n.
\end{align*}
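The argument that f is onto is constructive, so it translates directly into code that inverts f. The sketch below assumes the formula $f(i, j) = (i + j - 2)(i + j - 1)/2 + i$ that appears in the computation above; the name `f_inverse` is illustrative.

```python
def f(i, j):
    """The pairing function f(i, j) = (i + j - 2)(i + j - 1)/2 + i."""
    s = i + j
    return (s - 2) * (s - 1) // 2 + i


def f_inverse(n):
    """Return the pair (i, j) of positive integers with f(i, j) == n."""
    # k is the smallest positive integer with f(1, k) > n; since
    # f(1, 1) = 1 <= n, the search can start at k = 2.
    k = 2
    while f(1, k) <= n:
        k += 1
    i = n - f(1, k - 1) + 1
    j = k - i
    return i, j
```

Running `f_inverse` over the first few thousand positive integers and feeding the results back into `f` reproduces each input, which is the one-to-one and onto behaviour the proof describes.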


15. 


(a) If $B \setminus \{f(m) \mid m \in \mathbb{Z}^+, m < n\} = \emptyset$ then $B \subseteq \{f(m) \mid m \in \mathbb{Z}^+, m < n\}$,
so by exercises 8 and 10, B is finite. But we assumed that B
was infinite, so this is impossible.

(b) We use strong induction. Suppose that $\forall m < n$, $f(m) \ge m$. Now
suppose that f(n) < n. Let m = f(n). Then by the inductive hypothesis,
$f(m) \ge m$. Also, by the definition of f(n), $m = f(n) \in B \setminus \{f(k) \mid k \in
\mathbb{Z}^+, k < n\} \subseteq B \setminus \{f(k) \mid k \in \mathbb{Z}^+, k < m\}$. But since f(m) is the smallest
element of this last set, it follows that $f(m) \le m$. Since we have $f(m) \ge
m$ and $f(m) \le m$, we can conclude that f(m) = m. But then $m \notin B \setminus
\{f(k) \mid k \in \mathbb{Z}^+, k < n\}$, so we have a contradiction.

(c) Suppose that $i \in \mathbb{Z}^+$, $j \in \mathbb{Z}^+$, and $i \ne j$. Then either i < j or j < i.
Suppose first that i < j. Then according to the definition of f(j), $f(j) \in
B \setminus \{f(m) \mid m \in \mathbb{Z}^+, m < j\}$, and clearly $f(i) \in \{f(m) \mid m \in \mathbb{Z}^+, m < j\}$.
It follows that $f(i) \ne f(j)$. A similar argument shows that if j < i then
$f(i) \ne f(j)$. This shows that f is one-to-one.

To see that f is onto, suppose that $n \in B$. By part (b), $f(n + 1) \ge n +
1 > n$. But according to the definition of f, f(n + 1) is the smallest
element of $B \setminus \{f(m) \mid m \in \mathbb{Z}^+, m < n + 1\}$. It follows that $n \notin B \setminus
\{f(m) \mid m \in \mathbb{Z}^+, m < n + 1\}$. But $n \in B$, so it must be the case that
also $n \in \{f(m) \mid m \in \mathbb{Z}^+, m < n + 1\}$. In other words, for some
positive integer m < n + 1, f(m) = n.

17. Suppose $B \subseteq A$ and A is countable. Then by Theorem 8.1.5, there is a
one-to-one function $f: A \to \mathbb{Z}^+$. By exercise 13 of Section 5.2, $f \restriction B$ is
a one-to-one function from B to $\mathbb{Z}^+$, so B is countable. (See exercise 7
of Section 5.1 for the definition of the notation used here.)

19. Following the hint, we recursively define partial orders $R_n$, for $n \in \mathbb{N}$,
so that $R = R_0 \subseteq R_1 \subseteq R_2 \subseteq \cdots$ and
\[ \forall i \in I_n \forall j \in \mathbb{Z}^+ ((a_i, a_j) \in R_n \lor (a_j, a_i) \in R_n). \qquad (*) \]
Let $R_0 = R$. Given $R_n$, to define $R_{n+1}$ we apply exercise 2 of Section
6.2, with $B = \{a_i \mid i \in I_{n+1}\}$. Finally, let $T = \bigcup_{n \in \mathbb{N}} R_n$. Clearly T is
reflexive, because every $R_n$ is. To see that T is transitive, suppose that
$(a, b) \in T$ and $(b, c) \in T$. Then for some natural numbers m and n,
$(a, b) \in R_m$ and $(b, c) \in R_n$. If $m \le n$ then $R_m \subseteq R_n$, and therefore $(a, b)
\in R_n$ and $(b, c) \in R_n$. Since $R_n$ is transitive, it follows that $(a, c) \in
R_n \subseteq T$. A similar argument shows that if n < m then $(a, c) \in T$, so T
is transitive. The proof that T is antisymmetric is similar. Finally, to
see that T is a total order, suppose $x \in A$ and $y \in A$. Since we have
numbered the elements of A, we know that for some positive integers
m and n, $x = a_m$ and $y = a_n$. But then by (*) we know that either $(a_m, a_n)$
or $(a_n, a_m)$ is an element of $R_m$, and therefore also an element of T.

22. (a) We follow the hint.

Base case: n = 0. Suppose A and B are finite sets and |B| = 0. Then
$B = \emptyset$, so $A \times B = \emptyset$ and $|A \times B| = 0 = |A| \cdot 0$.

Induction step: Let n be an arbitrary natural number, and suppose
that for all finite sets A and B, if |B| = n then $A \times B$ is finite and
$|A \times B| = |A| \cdot n$. Now suppose A and B are finite sets and |B| = n + 1.
Choose an element $b \in B$, and let $B' = B \setminus \{b\}$, a set with n elements.
Then $A \times B = A \times (B' \cup \{b\}) = (A \times B') \cup (A \times \{b\})$, and since $b \notin B'$,
$A \times B'$ and $A \times \{b\}$ are disjoint. By the inductive hypothesis, $A \times B'$
is finite and $|A \times B'| = |A| \cdot n$. Also, it is not hard to see that
$A \sim A \times \{b\}$ (just match up each $x \in A$ with $(x, b) \in A \times \{b\}$), so $A \times \{b\}$ is
finite and $|A \times \{b\}| = |A|$. By Theorem 8.1.7, it follows that $A \times B$ is
finite and $|A \times B| = |A \times B'| + |A \times \{b\}| = |A| \cdot n + |A| = |A| \cdot (n + 1)$.

(b) To order a meal, you name an element of $A \times B$, where A = {steak,
chicken, pork chops, shrimp, spaghetti} and B = {ice cream, cake,
pie}. So the number of meals is $|A \times B| = |A| \cdot |B| = 5 \cdot 3 = 15$.

24. (a) Base case: n = 0. If |A| = 0 then $A = \emptyset$, so $F = \{\emptyset\}$, and |F| = 1
= 0!.

Induction step: Suppose n is a natural number, and the desired
conclusion holds for n. Now let A be a set with n + 1 elements, and
let F = {f | f is a one-to-one, onto function from $I_{n+1}$ to A}. Let $g: I_{n+1}
\to A$ be a one-to-one, onto function. For each $i \in I_{n+1}$, let $A_i = A \setminus
\{g(i)\}$, a set with n elements, and let $F_i$ = {f | f is a one-to-one, onto
function from $I_n$ to $A_i$}. By the inductive hypothesis, $F_i$ is finite and
$|F_i| = n!$. Now let $F^i = \{f \in F \mid f(n + 1) = g(i)\}$. Define a function
$h: F_i \to F^i$ by the formula $h(f) = f \cup \{(n + 1, g(i))\}$. It is not hard to
check that h is one-to-one and onto, so $F^i$ is finite and
$|F^i| = |F_i| = n!$. Finally, notice that $F = \bigcup_{i \in I_{n+1}} F^i$ and
$\forall i \in I_{n+1} \forall j \in I_{n+1}(i \ne j \to F^i \cap F^j = \emptyset)$. It follows, by exercise 21,
that F is finite and $|F| = \sum_{i \in I_{n+1}} |F^i| = (n + 1) \cdot n! = (n + 1)!$.

(b) Hint: Define $h: F \to L$ by the formula $h(f) = \{(a, b) \in A \times A \mid f^{-1}(a)
\le f^{-1}(b)\}$. (You should check that this set is a total order on A.) To
see that h is one-to-one, suppose that $f \in F$, $g \in F$, and $f \ne g$. Let i be
the smallest element of $I_n$ for which $f(i) \ne g(i)$. Now show that $(f(i),
g(i)) \in h(f)$ but $(f(i), g(i)) \notin h(g)$, so $h(f) \ne h(g)$. To see that h is
onto, suppose R is a total order on A. Define $g: A \to I_n$ by the
formula $g(a) = |\{x \in A \mid xRa\}|$. Show that $\forall a \in A \forall b \in A(aRb \leftrightarrow
g(a) \le g(b))$, and use this fact to show that $g^{-1} \in F$ and $h(g^{-1}) = R$.

(c) 5! = 120.

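27. Base case: n = 1. Then $I_n = \{1\}$, $P = \{\{1\}\}$, and $A_{\{1\}} = A_1$. Therefore
$|\bigcup_{i \in I_n} A_i| = |A_1|$ and $\sum_{S \in P} (-1)^{|S|+1}|A_S| = (-1)^2|A_{\{1\}}| = |A_1|$.

Induction step: Suppose the inclusion-exclusion principle holds for
n sets, and suppose $A_1, A_2, \ldots, A_{n+1}$ are finite sets. Let $P_n = \mathcal{P}(I_n) \setminus
\{\emptyset\}$ and $P_{n+1} = \mathcal{P}(I_{n+1}) \setminus \{\emptyset\}$. By exercise 26(a), exercise 23(a) of
Section 3.4, and the inductive hypothesis,
\begin{align*}
\Bigl|\bigcup_{i \in I_{n+1}} A_i\Bigr| &= \Bigl|\Bigl(\bigcup_{i \in I_n} A_i\Bigr) \cup A_{n+1}\Bigr| \\
&= \Bigl|\bigcup_{i \in I_n} A_i\Bigr| + |A_{n+1}| - \Bigl|\Bigl(\bigcup_{i \in I_n} A_i\Bigr) \cap A_{n+1}\Bigr| \\
&= \sum_{S \in P_n} (-1)^{|S|+1}|A_S| + |A_{n+1}| - \Bigl|\bigcup_{i \in I_n} (A_i \cap A_{n+1})\Bigr|.
\end{align*}
Now notice that for every $S \in P_n$,
\[ \bigcap_{i \in S} (A_i \cap A_{n+1}) = \Bigl(\bigcap_{i \in S} A_i\Bigr) \cap A_{n+1} = A_{S \cup \{n+1\}}. \]
Therefore, by another application of the inductive hypothesis,
\[ \Bigl|\bigcup_{i \in I_n} (A_i \cap A_{n+1})\Bigr| = \sum_{S \in P_n} (-1)^{|S|+1}|A_{S \cup \{n+1\}}|. \]
Thus
\begin{align*}
\Bigl|\bigcup_{i \in I_{n+1}} A_i\Bigr| &= \sum_{S \in P_n} (-1)^{|S|+1}|A_S| + |A_{n+1}| - \sum_{S \in P_n} (-1)^{|S|+1}|A_{S \cup \{n+1\}}| \\
&= \sum_{S \in P_n} (-1)^{|S|+1}|A_S| + (-1)^2|A_{\{n+1\}}| + \sum_{S \in P_n} (-1)^{|S \cup \{n+1\}|+1}|A_{S \cup \{n+1\}}|.
\end{align*}
Finally, notice that there are three kinds of elements of $P_{n+1}$: those
that are elements of $P_n$, the set {n + 1}, and sets of the form $S \cup \{n + 1\}$,
where $S \in P_n$. It follows that the last formula above is just
$\sum_{S \in P_{n+1}} (-1)^{|S|+1}|A_S|$, as required.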
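The inclusion-exclusion formula proved above, $|\bigcup_i A_i| = \sum_S (-1)^{|S|+1}|A_S|$ with S ranging over the nonempty sets of indices, can be evaluated directly for a small list of finite sets. The sketch below uses Python sets and `itertools.combinations` to enumerate the index sets; the function name is illustrative.

```python
from itertools import combinations


def inclusion_exclusion(sets):
    """Return the alternating sum over all nonempty sub-collections of sets.

    Each sub-collection of size k contributes (-1)^(k+1) times the size of
    the intersection of its members.
    """
    total = 0
    for size in range(1, len(sets) + 1):
        for chosen in combinations(sets, size):
            total += (-1) ** (size + 1) * len(set.intersection(*chosen))
    return total
```

Comparing the result with `len(set().union(*sets))` gives an independent count of the union. The sum has $2^n - 1$ terms, so this is practical only for a handful of sets.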


Section 8.2 


1. (a) By Theorem 8.1.6, Q is countable. If R \ Q were countable then, 
by Theorem 8.2.1, Q U (R \ Q) = R would be countable, 
contradicting Theorem 8.2.6. Thus, R \ Q must be uncountable. 


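(b) Let $A = \{\sqrt{2} + n \mid n \in \mathbb{Z}^+\}$. It is not hard to see that A and $\mathbb{Q}$ are
disjoint, since $\sqrt{2}$ is irrational, and A is denumerable. Now apply
Theorems 8.1.6 and 8.2.1 to conclude that $A \cup \mathbb{Q}$ is denumerable,
and therefore $A \cup \mathbb{Q} \sim A$. Finally, observe that $\mathbb{R} = (\mathbb{R} \setminus
(A \cup \mathbb{Q})) \cup (A \cup \mathbb{Q})$ and $\mathbb{R} \setminus \mathbb{Q} = (\mathbb{R} \setminus (A \cup \mathbb{Q})) \cup A$, and apply part 2 of
Theorem 8.1.2.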

5. Suppose that $A \sim \mathcal{P}(A)$. Then there is a function $f: A \to \mathcal{P}(A)$ that is
one-to-one and onto. Let $X = \{a \in A \mid a \notin f(a)\} \in \mathcal{P}(A)$. Since f is
onto, there must be some $a \in A$ such that f(a) = X. But then according
to the definition of X, $a \in X$ iff $a \notin f(a)$, so $X \ne f(a)$, which is a
contradiction.


10. Hint: Define $f: \mathcal{P}(A) \times \mathcal{P}(B) \to \mathcal{P}(A \cup B)$ by the formula $f(X, Y) = X \cup
Y$, and prove that f is one-to-one and onto.

13. For each positive integer n, let $A_n = \{x \in A \mid x \ge 1/n\}$. Clearly
$\bigcup_{n \in \mathbb{Z}^+} A_n \subseteq A$. Now suppose $x \in A$. Then $x \in \mathbb{R}^+$, so x > 0. Let n be a
positive integer large enough that $n \ge 1/x$. Then $x \ge 1/n$, so $x \in A_n$.
We conclude that $A \subseteq \bigcup_{n \in \mathbb{Z}^+} A_n$, and therefore $\bigcup_{n \in \mathbb{Z}^+} A_n = A$.
Suppose $a_1, a_2, \ldots, a_k$ are distinct elements of $A_n$. Then
\[ b \ge a_1 + a_2 + \cdots + a_k \ge \frac{1}{n} + \frac{1}{n} + \cdots + \frac{1}{n} = \frac{k}{n}, \]
so $k \le bn$. Therefore $A_n$ is finite, and in fact $|A_n| \le bn$. By Theorem
8.2.2, it follows that $A = \bigcup_{n \in \mathbb{Z}^+} A_n$ is countable.

15. Hint: First note that if $F = \emptyset$ then g can be any function. If $F \ne \emptyset$,
then since F is countable, we can write its elements in a list: $F = \{f_1,
f_2, \ldots\}$. Now define $g: \mathbb{Z}^+ \to \mathbb{R}$ by the formula
$g(n) = \max\{|f_1(n)|, |f_2(n)|, \ldots, |f_n(n)|\}$.

(a) If Q is countable, then by part 2 of Theorem 8.2.1, $P \cup Q$ is
countable. But $P \cup Q = \mathcal{P}(\mathbb{Z}^+)$, which is uncountable by
Cantor's theorem. Therefore Q is uncountable.

(b) Suppose $A \in Q$. For every $n \in \mathbb{Z}^+$, $A \cap I_n \subseteq I_n$, so by exercise 8(a)
in Section 8.1, $A \cap I_n$ is finite. Therefore $S_A \subseteq P$. Now suppose $S_A$ is
finite. Then there is some positive integer n such that $S_A = \{A \cap I_1,
A \cap I_2, \ldots, A \cap I_n\}$. We claim now that $A \subseteq I_n$; this will complete the
proof, because it implies that A is finite, contradicting our assumption
that $A \in Q$. To prove this claim, suppose that $m \in A$. Then $A \cap I_m \in
S_A$, so there is some $k \le n$ such that $A \cap I_m = A \cap I_k \subseteq I_k \subseteq I_n$. But
$m \in A \cap I_m$, so we conclude that $m \in I_n$, as required.

(c) Suppose $A \in Q$, $B \in Q$, and $A \ne B$. Then there is some positive
integer n such that either $n \in A$ and $n \notin B$ or $n \in B$ and $n \notin A$. We
will assume $n \in A$ and $n \notin B$; the proof for the other case is similar.
We claim now that $S_A \cap S_B \subseteq \{A \cap I_1, A \cap I_2, \ldots, A \cap I_{n-1}\}$; this will
complete the proof, because it implies that $S_A \cap S_B$ is finite. To prove
the claim, suppose that $X \in S_A \cap S_B$. Then there are positive integers
$n_A$ and $n_B$ such that $X = A \cap I_{n_A}$ and $X = B \cap I_{n_B}$. If $n_A \ge n$ then
\[ n \in A \cap I_{n_A} = X = B \cap I_{n_B} \subseteq B, \]
which is a contradiction. Therefore $n_A < n$, so $X = A \cap I_{n_A} \in \{A \cap I_1,
\ldots, A \cap I_{n-1}\}$, as required.

(d) If $A \in Q$ then $S_A \subseteq P$, so since $g: P \to \mathbb{Z}^+$, $g(S_A) \subseteq \mathbb{Z}^+$. Also, since $S_A$ is infinite and $g$ is one-to-one, $g(S_A)$ is also infinite. This proves that $F \subseteq \mathcal{P}(\mathbb{Z}^+)$ and every element of $F$ is infinite. To see that $F$ is pairwise almost disjoint, suppose $X, Y \in F$ and $X \neq Y$. Then there are sets $A, B \in Q$ such that $X = g(S_A)$ and $Y = g(S_B)$. Since $X \neq Y$, $A \neq B$, so by part (c), $S_A \cap S_B$ is finite, and therefore $g(S_A \cap S_B)$ is finite. By Theorem 5.5.2, $g(S_A \cap S_B) = g(S_A) \cap g(S_B) = X \cap Y$, so $X$ and $Y$ are almost disjoint. Finally, define $h: Q \to F$ by the formula $h(A) = g(S_A)$. It is easy to check that $h$ is one-to-one and onto, so $F \sim Q$ and therefore, by part (a), $F$ is uncountable.
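To get a feel for why the sets $g(S_A)$ and $g(S_B)$ overlap so little, it may help to compute a finite piece of two of them. The Python sketch below rests on assumptions that are not in the solution: the particular one-to-one function `g` (a binary encoding plus 1), the two sample sets (even numbers and multiples of 3), and the cutoff `limit`, which is needed because a program can only list finitely many of the sets $A \cap I_n$.

```python
def g(finite_set):
    """A hypothetical one-to-one map from finite subsets of Z+ into Z+:
    the set is read as a binary numeral, then 1 is added so that the
    empty set is sent to 1 rather than 0."""
    return 1 + sum(2 ** (i - 1) for i in finite_set)

def truncated_code_set(member, limit):
    """Return {g(A ∩ I_n) : 1 <= n <= limit}, where A = {m : member(m)}
    and I_n = {1, ..., n}. This is a finite piece of g(S_A)."""
    codes = set()
    prefix = set()  # holds A ∩ I_n as n increases
    for n in range(1, limit + 1):
        if member(n):
            prefix.add(n)
        codes.add(g(prefix))
    return codes

evens = truncated_code_set(lambda m: m % 2 == 0, 60)
threes = truncated_code_set(lambda m: m % 3 == 0, 60)
common = evens & threes
print(sorted(common))
```

Here the two sets first differ at $n = 2$, so part (c) predicts that the only set they can share is $A \cap I_1 = \emptyset$; with this choice of `g` the empty set is encoded as 1.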


Section 8.3 

1. (a) The function $i_A: A \to A$ is one-to-one.

(b) Suppose $A \precsim B$ and $B \precsim C$. Then there are one-to-one functions $f: A \to B$ and $g: B \to C$. By part 1 of Theorem 5.2.5, $g \circ f: A \to C$ is one-to-one, so $A \precsim C$.

5. Let $g: A \to B$ and $h: C \to D$ be one-to-one functions.

(a) Since $A \neq \emptyset$, we can choose some $a_0 \in A$. Notice that $g^{-1}: \mathrm{Ran}(g) \to A$. Define $j: B \to A$ as follows:

$$
j(b) =
\begin{cases}
g^{-1}(b), & \text{if } b \in \mathrm{Ran}(g), \\
a_0, & \text{otherwise.}
\end{cases}
$$

We let you verify that $j$ is onto.
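Before writing out the verification, it can be reassuring to try the construction on a small example. The Python sketch below is an illustration only: the function name, the dictionary representation of functions, and the sample sets are assumptions made for this example. It builds $j$ from a one-to-one $g$ and a chosen element $a_0$, following the two cases of the definition.

```python
def onto_from_one_to_one(g, A, B, a0):
    """Build j: B -> A with j(b) = g^{-1}(b) when b is in Ran(g),
    and j(b) = a0 otherwise. Functions are stored as dictionaries."""
    inverse = {g[a]: a for a in A}
    if len(inverse) != len(A):
        raise ValueError("g is not one-to-one, so g^{-1} is not a function")
    return {b: inverse.get(b, a0) for b in B}

# Hypothetical sample data, chosen only for illustration.
A = ["x", "y"]
B = [1, 2, 3]
g = {"x": 1, "y": 3}
j = onto_from_one_to_one(g, A, B, a0="x")
print(j)
```

Every element $a$ of `A` shows up as a value of `j` because $j(g(a)) = a$, which is the key step in the verification left to the reader.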


(b) Now define $F: {}^{A}C \to {}^{B}D$ by the formula $F(f) = h \circ f \circ j$. To see that $F$ is one-to-one, suppose that $f_1 \in {}^{A}C$, $f_2 \in {}^{A}C$, and $F(f_1) = F(f_2)$, which means $h \circ f_1 \circ j = h \circ f_2 \circ j$. Let $a \in A$ be arbitrary. Since $j$ is onto, there is some $b \in B$ such that $j(b) = a$. Therefore $h(f_1(a)) = (h \circ f_1 \circ j)(b) = (h \circ f_2 \circ j)(b) = h(f_2(a))$, and since $h$ is one-to-one, it follows that $f_1(a) = f_2(a)$. Since $a$ was arbitrary, this shows that $f_1 = f_2$.

Yes. (You should be able to justify this answer with a counterexample.)

(a) Let $n$ be arbitrary, and then proceed by induction on $m$. The base case is $m = n + 1$, and it is taken care of by exercise 7. For the induction step, apply exercise 2(b).

(b) $\bigcup_{n \in \mathbb{Z}^+} A_n$ is an infinite set that is not equinumerous with $A_n$ for any $n \in \mathbb{Z}^+$. In fact, for every positive integer $n$, $A_n \prec \bigcup_{n \in \mathbb{Z}^+} A_n$. Can you find even larger infinite sets?

10. (a) Note that $E \subseteq \mathcal{P}(\mathbb{Z}^+ \times \mathbb{Z}^+)$. It follows, using exercise 5 of Section 8.1, that $E \precsim \mathcal{P}(\mathbb{Z}^+ \times \mathbb{Z}^+) \sim \mathcal{P}(\mathbb{Z}^+)$.

(b) Suppose $f(X) = f(Y)$. Then $X \cup \{1\} \in f(X) = f(Y) = \{Y \cup \{1\}, (A \setminus Y) \cup \{2\}\}$, so either $X \cup \{1\} = Y \cup \{1\}$ or $X \cup \{1\} = (A \setminus Y) \cup \{2\}$. But clearly $2 \notin X \cup \{1\}$, so the second possibility can be ruled out. Therefore $X \cup \{1\} = Y \cup \{1\}$. Since neither $X$ nor $Y$ contains 1, it follows that $X = Y$.

(c) Clearly $A$ is denumerable, and we showed at the end of Section 5.3 that $P \sim E$. It follows that $\mathcal{P}(\mathbb{Z}^+) \sim \mathcal{P}(A) \precsim P \sim E$. Combining this with part (a) and applying the Cantor-Schröder-Bernstein theorem gives the desired conclusion.

14. (a) According to the definition of function, ${}^{\mathbb{R}}\mathbb{R} \subseteq \mathcal{P}(\mathbb{R} \times \mathbb{R})$, and therefore by exercise 12(b) and exercise 5 of Section 8.1, ${}^{\mathbb{R}}\mathbb{R} \precsim \mathcal{P}(\mathbb{R} \times \mathbb{R}) \sim \mathcal{P}(\mathbb{R})$.

Clearly $\{\text{yes}, \text{no}\} \precsim \mathbb{R}$, so by exercise 6(c) of Section 8.2 and exercise 5, $\mathcal{P}(\mathbb{R}) \sim {}^{\mathbb{R}}\{\text{yes}, \text{no}\} \precsim {}^{\mathbb{R}}\mathbb{R}$. Since we have both ${}^{\mathbb{R}}\mathbb{R} \precsim \mathcal{P}(\mathbb{R})$ and $\mathcal{P}(\mathbb{R}) \precsim {}^{\mathbb{R}}\mathbb{R}$, by the Cantor-Schröder-Bernstein theorem, ${}^{\mathbb{R}}\mathbb{R} \sim \mathcal{P}(\mathbb{R})$.

(b) By Theorems 8.1.6 and 8.3.3, exercise 23(a) of Section 8.1, and exercise 6(d) of Section 8.2, ${}^{\mathbb{Q}}\mathbb{R} \sim {}^{\mathbb{Z}^+}\mathcal{P}(\mathbb{Z}^+) \sim \mathcal{P}(\mathbb{Z}^+) \sim \mathbb{R}$.

(c) Define $F: C \to {}^{\mathbb{Q}}\mathbb{R}$ by the formula $F(f) = f \restriction \mathbb{Q}$. (See exercise 7 of Section 5.1 for the meaning of the notation used here.) Suppose $f \in C$, $g \in C$, and $F(f) = F(g)$. Then $f \restriction \mathbb{Q} = g \restriction \mathbb{Q}$, which means that for all $x \in \mathbb{Q}$, $f(x) = g(x)$. Now let $x$ be an arbitrary real number. Use Lemma 8.3.4 to construct a sequence $x_1, x_2, \ldots$ of rational numbers such that $\lim_{n \to \infty} x_n = x$. Then since $f$ and $g$ are continuous, $f(x) = \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} g(x_n) = g(x)$. Since $x$ was arbitrary, this shows that $f = g$. Therefore $F$ is one-to-one, so $C \precsim {}^{\mathbb{Q}}\mathbb{R}$. Combining this with part (b), we can conclude that $C \precsim \mathbb{R}$.

Now define $G: \mathbb{R} \to C$ by the formula $G(x) = \mathbb{R} \times \{x\}$. In other words, $G(x)$ is the constant function whose value at every real number is $x$. Clearly $G$ is one-to-one, so $\mathbb{R} \precsim C$. By the Cantor-Schröder-Bernstein theorem, it follows that $C \sim \mathbb{R}$.
Summary of Proof Techniques


To prove a goal of the form:

1. ¬P:
   (a) Reexpress as a positive statement.
   (b) Use proof by contradiction; that is, assume that P is true and try to reach a contradiction.

2. P → Q:
   (a) Assume P is true and prove Q.
   (b) Prove the contrapositive; that is, assume that Q is false and prove that P is false.

3. P ∧ Q:
   Prove P and Q separately. In other words, treat this as two separate goals: P, and Q.

4. P ∨ Q:
   (a) Assume P is false and prove Q, or assume Q is false and prove P.
   (b) Use proof by cases. In each case, either prove P or prove Q.

5. P ↔ Q:
   Prove P → Q and Q → P, using the methods listed under part 2.

6. ∀xP(x):
   Let x stand for an arbitrary object, and prove P(x). (If the letter x already stands for something in the proof, you will have to use a different letter for the arbitrary object.)

7. ∃xP(x):
   Find a value of x that makes P(x) true. Prove P(x) for this value of x.

8. ∃!xP(x):
   (a) Prove ∃xP(x) (existence) and ∀y∀z((P(y) ∧ P(z)) → y = z) (uniqueness).
   (b) Prove the equivalent statement ∃x(P(x) ∧ ∀y(P(y) → y = x)).

9. ∀n ∈ ℕ P(n):
   (a) Mathematical induction: Prove P(0) (base case) and ∀n ∈ ℕ(P(n) → P(n + 1)) (induction step).
   (b) Strong induction: Prove ∀n ∈ ℕ[(∀k < n P(k)) → P(n)].
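Several of the strategies for goals rest on logical equivalences that can be confirmed with a truth table. The short Python sketch below is an illustration only, with helper names invented for this example. It checks three of them over all four assignments of truth values to P and Q: the contrapositive law behind strategy 2(b), the equivalence of P ∨ Q with ¬P → Q behind strategy 4(a), and the equivalence of P ↔ Q with the pair of conditionals used in strategy 5.

```python
from itertools import product

def implies(p, q):
    """Truth value of the conditional p -> q."""
    return (not p) or q

def equivalent_on_all_rows(lhs, rhs):
    """True if lhs and rhs agree on every row of the truth table."""
    return all(lhs(p, q) == rhs(p, q)
               for p, q in product([False, True], repeat=2))

contrapositive_ok = equivalent_on_all_rows(
    lambda p, q: implies(p, q),
    lambda p, q: implies(not q, not p))
disjunction_ok = equivalent_on_all_rows(
    lambda p, q: p or q,
    lambda p, q: implies(not p, q))
biconditional_ok = equivalent_on_all_rows(
    lambda p, q: p == q,
    lambda p, q: implies(p, q) and implies(q, p))
print(contrapositive_ok, disjunction_ok, biconditional_ok)
```

A truth table settles only the propositional strategies; the quantifier strategies in items 6 through 9 cannot be checked this way.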


To use a given of the form:

1. ¬P:
   (a) Reexpress as a positive statement.
   (b) In a proof by contradiction, you can reach a contradiction by proving P.

2. P → Q:
   (a) If you are also given P, or you can prove that P is true, then you can conclude that Q is true.
   (b) Use the contrapositive: If you are given or can prove that Q is false, then you can conclude that P is false.

3. P ∧ Q:
   Treat this as two givens: P, and Q.

4. P ∨ Q:
   (a) Use proof by cases. In case 1 assume that P is true, and in case 2 assume that Q is true.
   (b) If you are also given that P is false, or you can prove that P is false, then you can conclude that Q is true. Similarly, if you know that Q is false then you can conclude that P is true.

5. P ↔ Q:
   Treat this as two givens: P → Q, and Q → P.

6. ∀xP(x):
   You can plug in any value, say a, for x, and conclude that P(a) is true.

7. ∃xP(x):
   Introduce a new variable, say x₀, into the proof, to stand for a particular object for which P(x₀) is true.

8. ∃!xP(x):
   Introduce a new variable, say x₀, into the proof, to stand for a particular object for which P(x₀) is true. You may also assume that ∀y(P(y) → y = x₀).


Techniques that can be used in any proof: 
1. Proof by contradiction: Assume the goal is false and derive a 
contradiction. 


2. Proof by cases: Consider several cases that are exhaustive, that is, that 
include all the possibilities. Prove the goal in each case. 